Abstract: In this two-part paper, the DMT of cooperative 
multi-hop networks is examined. The focus is on single-source 
single-sink (ss-ss) multi-hop relay networks having slow-fading 
links and relays that potentially possess multiple antennas. In 
this first part, some basic results that help in determining the 
DMT of cooperative networks, as well as in characterizing the 
two end-points of the DMT for arbitrary full-duplex networks, are 
established. In the companion paper, two families of half-duplex 
networks are studied. 

The present paper examines the two end-points of the DMT of 
ss-ss networks. In particular, the maximum achievable diversity 
of arbitrary multi-terminal wireless networks is shown to be 
equal to the min-cut between the corresponding source and the 
sink. The maximum multiplexing gain (MMG) of arbitrary full- 
duplex ss-ss networks is shown to be equal to the min-cut rank, 
using a new connection to a deterministic network for which the 
capacity was recently found. This connection is operational in 
the sense that a capacity-achieving scheme for the deterministic 
network can be converted into an MMG-achieving scheme for the 
original network. 

We also prove some basic results including a proof that 
the colored noise encountered in AF protocols for cooperative 
networks can be treated as white noise for DMT computations. 
We derive lower bounds for the DMT of triangular channel 
matrices, which are subsequently utilized to derive alternative, 
and often simpler, proofs of several existing results. The DMT of a 
parallel channel with independent MIMO links is also computed 
here. As an application of these basic results, we prove that 
a linear tradeoff between maximum diversity and maximum 
multiplexing gain is achievable for arbitrary, ss-ss single-antenna, 
directed-acyclic networks equipped with full-duplex relays. 

All protocols in this paper are explicit and rely only upon 
amplify-and-forward (AF) relaying. Explicit codes for all proto- 
cols introduced here are included in the companion paper. 



A. Prior Work 

The concept of user cooperative diversity was introduced 
in [2]. Cooperative diversity protocols were first discussed in 
[3] for the two-hop, single-relay network (Fig. 1(a)). Zheng 
and Tse [4] proposed the Diversity-Multiplexing gain Tradeoff 
(DMT) as a means of evaluating point-to-point, multiple-antenna 
schemes in the context of slow-fading channels. 

1) Two-hop Networks: The DMT was also used as a tool to 
compare various protocols for half-duplex two-hop cooperative 
networks in [5], [6]. As noted in [9], the DMT is simple 
enough to be analytically tractable and powerful enough to 
compare different cooperative-relay-network protocols. For 
any network, an upper bound on the achievable DMT is given 
by the cut-set bound [9], [49]. A fundamental question in this 
area is whether the cut-set bound on DMT can be achieved. 
While this question has been studied extensively for the two-hop 
cooperative wireless system in Fig. 1(a), the question still 
remains open even for this class of networks (see [10], [12] for 
a detailed comparison of existing achievable regions). 

In [5], the selection-decode-and-forward protocol is ana- 
lyzed for an arbitrary number of relays, where the authors 
give upper and lower bounds on the DMT of the protocol. 
In these protocols, the relays and the source node participate 
for equal time instants and the maximum multiplexing gain r 
achieved is equal to 0.5. 




I. Introduction 

In fading relay networks, cooperative diversity provides a 
means of operating the network efficiently. While much of 
the work in the literature on cooperative diversity is based on 
two-hop networks, the attention here is on multi-hop networks. 

Fig. 1. Two-hop cooperative relay networks: (a) general two-hop relay network; (b) single-relay network. 

In [6], Azarian et al. analyze the class of Non-orthogonal 
Amplify-and-Forward (NAF) protocols, introduced earlier by 
Nabar et al. in [7], and establish the improved DMT of the NAF 
protocol in comparison to the class of Orthogonal 
Amplify-and-Forward (OAF) protocols considered in [5]. It has been 
shown in [10] that the DMT of the NAF protocol can be 
obtained via the OAF protocol as well, using appropriate 
unequal slot lengths for source and relay transmissions. 

The authors of [6] also introduce the Dynamic Decode-and-Forward 
(DDF) protocol, wherein the time duration for which 
the relays listen to the source depends on the source-relay 
channel gain. They show that for the single-relay case, the 
DMT of the DDF protocol achieves the cut-set bound (also 
known as the transmit-diversity bound) for r < 0.5, beyond 
which the DMT falls below the bound. An enhanced DDF 
protocol is proposed in [12] that improves upon DDF. However, 
the DMT of this protocol also falls short of the 
transmit-diversity bound for r > 0.5. 

Yang and Belfiore consider a class of protocols called 
Slotted Amplify-and-Forward (SAF) protocols in [19] for the 
two-hop network with direct link, and show that these improve 
upon the performance of the NAF protocol [6] for the case of 
two relays. Under the assumption of relay isolation, the naive 
SAF scheme proposed in [19] is shown to achieve the cut-set 
bound. It is also conjectured in [19] that the SAF protocol is 
optimal even when the relays are not isolated. 

Yuksel and Erkip in [9] have considered the DMT of the DF 
and compress-and-forward (CF) protocols. They show that the 
CF protocol achieves the transmit-diversity bound for the case 
of a single relay. We note, however, that in the CF protocol, 
the relays are assumed to know all the fading coefficients in 
the system. The authors also translate cut-set upper bounds in 
[49] for mutual information into the DMT framework for a 
general multi-terminal network. 

Jing and Hassibi [8] consider cooperative communication 
protocols for the two-hop network without a direct link be- 
tween source and destination. They study protocols where the 
relay nodes apply a linear transformation to the received signal 
and analyze their BER performance. The authors consider the 
case when the source and the relays transmit for an equal 
number of channel uses and the relays perform a unitary 
transformation on the input symbols before transmitting them. Rao 
and Hassibi [28] consider two-hop half-duplex multi-antenna 
cooperative networks without a direct link, provide an AF 
scheme, and compute the DMT achieved by the scheme. Their 
scheme incurs a rate loss of a factor of two compared to the 
cut-set bound. In a parallel work [30], the DMT of the two-hop 
network without a direct link is proved to be equal to the 
cut-set bound. 

2) Multi-hop Networks: Yang and Belfiore in [18] consider 
AF protocols for a family of MIMO multi-hop networks 
(which are termed as multi-antenna layered networks in the 
current paper). They derive the optimal DMT for the Rayleigh- 
product channel which they prove is equal to the DMT of 
the AF protocol applied to this channel. They also propose 
AF protocols to achieve the optimal diversity of these multi- 
antenna layered networks. 

Oggier and Hassibi [40] have proposed distributed space 
time codes for multi-antenna layered networks that achieve 
diversity gain equal to the minimum number of relay nodes 
among the hops. Recently, Vaze and Heath [41] have con- 
structed distributed space time codes based on orthogonal 
designs that achieve the optimal diversity of the multi-antenna 
layered network with low decoding complexity. In [42], the 
same authors study the circumstances under which full diver- 
sity can be achieved without coding in a layered network in 
the presence of partial CSIT. 

Borade, Zheng and Gallager in [27] consider AF schemes 
on a class of multi-hop layered networks where each layer has 
the same number of relays (termed as regular networks in the 
current paper). They show that AF strategies are optimal in 
terms of multiplexing gain. They also compute lower bounds 
on the DMT of the product Rayleigh channel. 

3) Capacity: There has been recent interest in determining 
approximations to the capacity of wireless networks. The 
pre-log coefficient of the capacity, termed the degrees-of-freedom 
(DOF) of wireless multi-antenna networks, is studied 
in [45], [46].¹ The DOF for the N-user interference channel 
was derived in [35], for the MIMO X networks in [36], 
[37], and the DOF of single-source single-sink (ss-ss) layered 
networks was obtained in [27]. 

In a different direction, the capacity of ss-ss and multi-cast 
deterministic wireless networks has been characterized in [31]. 

Intuition drawn from the deterministic wireless networks 
was used to identify capacity to within a constant for some 
example networks in [32]. Very recently, the capacity of 
single-antenna Gaussian relay networks has been characterized to 
within a constant number of bits in [33]. This result also easily 
extends to give the approximate compound-channel capacity 
for full-duplex single-antenna networks. The results in [33] 
can also be used to show that for half-duplex networks, under 
any fixed schedule of operation, the best possible rate can be 
achieved (to within a constant number of bits). However, the 
determination of optimal schedules that achieve the maximum 
possible DMT remains open; we solve this problem for certain 
classes of networks in the companion paper. 

In [34], given a wireline network code, a scheme for the 
wireless Gaussian relay channel is obtained in which each relay 
computes linear transformations of its input signals, and the 
achievable rate region for the scheme is characterized. 

4) Codes: Cyclic Division Algebras (CDA) were first used 
to construct space-time codes in [20]. The notion of space-time 
codes having a non-vanishing determinant (NVD) was 
introduced in [11]. Subsequently, it was shown in [14] that 
CDA-based ST codes with NVD achieve the DMT of the 
Rayleigh-fading channel, and minimal-delay codes with NVD 
were constructed for all $n_t$. From the results in [13], these 
codes are, moreover, approximately universal, i.e., DMT optimal 
for every statistical characterization of the fading channel. 

These codes were tailored to suit the structure of various 
static protocols for two-hop cooperation and proved to be 
DMT optimal for certain protocols in [10]. For the DDF 
protocol, DMT optimal codes were constructed for arbitrary 
number of relays with multiple antennas in [15]. For the 
specific case of single-relay single-antenna DDF channel, 
codes were constructed recently in [16], which are not only 
DMT optimal, but also have probability of error close to 
the outage probability. Codes for the multi-antenna two-hop 
network under the NAF protocol were presented in [17]. CDA-based 
ST code constructions for the Rayleigh parallel channel 
were provided in [43], [44]. This construction was shown to 
be approximately universal for the class of MIMO parallel 
channels in [15]. In this paper, we present a DMT optimal code 
design for all proposed protocols based on the approximately 
universal codes in [14] and [15]. 

¹The degrees-of-freedom is alternately referred to as maximum multiplexing 
gain in the literature, although the former is typically used for ergodic 
capacity characterizations and the latter is typically used in the context of 
outage characterization. This paper deals with the DMT, which is an outage 
characterization, and for this reason, we use the term multiplexing gain. 

5) Other Work: Cooperative networks with asynchronous 
transmissions have also been studied in the literature [54], 
[55], [56]. However, we consider networks in which relays 
are synchronized. Codes for two-hop cooperative networks 
having low decoding complexity and full diversity are studied 
in [57], [56] and [58]. While decoding complexity is not the 
primary focus of the present paper, we do provide a successive- 
interference-cancellation technique to reduce the code length 
and thereby, the complexity. 

B. Setting and Channel Model 

1) Network Representation by a Graph: Unless otherwise 
stated, all networks considered possess a single source and a 
single sink and we will apply the abbreviation ss-ss to denote 
these networks. Any wireless network can be associated with a 
directed graph, with vertices representing nodes in the network 
and edges representing connectivity between nodes. If an edge 
is bidirectional, we will represent it by two edges, one pointing 
in either direction. An edge in a directed graph is said to be 
live at a particular time instant if the node at the head of the 
edge is transmitting at that instant. An edge in a directed graph 
is said to be active at a particular time instant if the node at 
the head of the edge is transmitting and the tail of the edge is 
receiving at that instant. 

A wireless network is characterized by broadcast and 
interference constraints. Under the broadcast constraint, all 
edges connected to a transmitting node are simultaneously 
live and transmit the same information. Under the interference 
constraint, the symbol received by a receiving end is equal to 
the sum of the symbols transmitted on all incoming live edges. 
We say that a protocol avoids interference if only one incoming 
edge is live for all receiving nodes. 

In wireless networks, the relay nodes operate in either half 
or full-duplex mode. In case of half-duplex operation, a node 
cannot simultaneously listen and transmit, i.e., an incoming 
edge and an outgoing edge of a node cannot simultaneously 
be active. 
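
The graph model above can be sketched in a few lines of code. The following is a minimal illustration, not part of the original development: the `Edge` class, the function names and the three-node example network are all hypothetical. To sidestep the head/tail convention, each edge simply records which node transmits on it (`sender`) and which node hears it (`receiver`), and "avoids interference" is read as at most one live incoming edge per listening node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Directed edge; `sender` is the node that transmits on this edge."""
    sender: str
    receiver: str


def live_edges(edges, transmitting):
    """Broadcast constraint: every edge leaving a transmitting node is live."""
    return {e for e in edges if e.sender in transmitting}


def active_edges(edges, transmitting, receiving):
    """An edge is active when its sender transmits and its receiver listens."""
    return {e for e in live_edges(edges, transmitting) if e.receiver in receiving}


def respects_half_duplex(transmitting, receiving):
    """Half-duplex: no node may transmit and listen at the same instant."""
    return not (set(transmitting) & set(receiving))


def avoids_interference(edges, transmitting, receiving):
    """True if every listening node has at most one live incoming edge."""
    live = live_edges(edges, transmitting)
    return all(sum(1 for e in live if e.receiver == v) <= 1 for v in receiving)


# Hypothetical single-relay network with a direct link: s -> r, s -> d, r -> d.
EDGES = {Edge("s", "r"), Edge("s", "d"), Edge("r", "d")}
```

With only the source transmitting, both of its outgoing edges are live and active and no interference occurs; if the source and the relay transmit together, the sink sees two live incoming edges, so that schedule does not avoid interference.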

In this paper, we use uppercase letters to denote matrices 
and lowercase letters to denote vectors/scalars. Vectors and 
scalars are differentiated only through the context. Irrespective 
of whether a particular random entity is a scalar, vector or a 
matrix, the entity will be represented using boldface letters. 

Between any two adjacent nodes $v_x$, $v_y$ of a wireless 
network, we assume the following channel model: 

$$y = Hx + w, \tag{1}$$

where $y$ corresponds to the received signal at node $v_y$, $w$ is 
the noise vector, $H$ is a matrix and $x$ is the vector transmitted 
by the node $v_x$. 

2) Assumptions: We follow the literature in making the 
assumptions listed below. Our description is in terms of the 
equivalent complex-baseband, discrete-time channel. 



1) All channels are assumed to be quasi-static and to experience 
Rayleigh fading, and hence all fade coefficients are 
i.i.d., circularly-symmetric complex Gaussian $\mathcal{CN}(0, 1)$ 
random variables. 

2) The additive noise at each receiver is also modelled 
as possessing an i.i.d., circularly-symmetric complex 
Gaussian $\mathcal{CN}(0, 1)$ distribution. 

3) Each receiver (but none of the transmitters) is assumed 
to have perfect channel state information of all the 
upstream channels in the network.² 

C. Results 

In this paper, we characterize maximum diversity, max- 
imum multiplexing gain and achievable DMT for arbitrary 
cooperative networks. Some of these results were presented 
in conference versions of this paper [21]-[24] (see also [25], 
[26]). Special classes of networks are considered in the second 
part of this two-part paper, [1]. Optimal code design for all 
proposed protocols in both parts of the paper can also be found 
there. 

The principal results established in this paper are the 
following (see Table I for a tabular summary of results). 

1) The maximum diversity of a multi-antenna multi- 
terminal network is equal to the value of the min-cut 
between the source and the destination. 

2) The maximum multiplexing gain for a ss-ss full-duplex 
multi-antenna network is equal to the minimum rank of 
any cut between the source and the destination. 

3) A DMT which is linear between the maximum diversity 
and maximum multiplexing gain is achievable for full- 
duplex single-antenna relay networks. 

We also prove the following general results, which are useful 
in computing the DMT of cooperative networks: 

4) The colored noise encountered in cooperative networks 
can be treated as white for DMT computations. 

5) We provide a lower bound on the DMT of triangular 
matrices. 

6) We compute the DMT of a parallel MIMO channel in 
terms of the DMT of the component MIMO links. 

D. Relation to Existing Literature 

1) Proof of a Conjecture by Rao and Hassibi: 

The results in Example 5 in Section II-E.1 prove 
Conjecture 1 of [28] and [29]. 

2) Lower bound on the DMT of various AF Protocols: 
Certain results in this paper can be used to recover 
existing results on the DMT of AF protocols in a 
simpler, concise and more intuitive manner. 

NAF Protocol: We compute a lower bound on the DMT 
of the NAF protocol, which turns out to be tight, as 
proved in [6]. 

SAF Protocol: We compute a lower bound on the DMT 
of the Slotted Amplify-and-Forward (SAF) protocol 

²However, for the protocols proposed in this paper, the CSIR is utilized 
only at the sink, since all the relay nodes are required to simply amplify and 
forward the received signal. 

TABLE I 
Principal Results Summary 

| Network | No. of sources/sinks | No. of antennas in nodes | FD/HD | Direct link | Upper bound on diversity/DMT $d_{\text{bound}}(r)$ | Achievable diversity/DMT $d_{\text{achieved}}(r)$ | Is upper bound achieved? | Reference |
|---|---|---|---|---|---|---|---|---|
| Arbitrary | Multiple | Multiple | FD/HD | ✓ | $d(0)$ = Min-cut | $d(0)$ = Min-cut | ✓ ($d_{\max}$ achieved) | Theorem 3.1 |
| Arbitrary | Multiple | Multiple | FD/HD | ✗ | $d(0)$ = Min-cut | $d(0)$ = Min-cut | ✓ ($d_{\max}$ achieved) | Theorem 3.1 |
| Arbitrary | Single | Multiple | FD | ✓ | $r_{\max}$ = Rank of min-cut | $r_{\max}$ = Rank of min-cut | ✓ ($r_{\max}$ achieved) | Theorem 3.4 |
| Arbitrary directed acyclic networks | Single | Single | FD | ✓ | Concave in general | $d_{\max}(1 - r)^{+}$ | A linear DMT between $d_{\max}$ and $r_{\max}$ is achieved | Theorem 4.1 |


under the relay-isolation assumption [19] in Example 2 
of Section II-E.1. From the results in [19], this lower 
bound is in fact tight. 

N-Relay MIMO NAF Channel Appearing in [17]: 
In Example 5 of Section II-E.1, we prove an improved 
lower bound on the DMT for the MIMO NAF protocol 
for a two-hop multi-antenna network with a direct link 
compared to the bound in [17]. 

3) The diversity of arbitrary cooperative networks. 

As noted earlier, we characterize completely the maxi- 
mum diversity order attainable for arbitrary cooperative 
networks and it is shown that an amplify-and-forward 
scheme is sufficient to achieve this. Special cases of 
these were derived for the MIMO two-hop relay channel 
in [17], under a certain condition on the number of 
antennas (See Corollary 1 in that paper). Also, the 
diversity order of layered networks using amplify-and-forward 
protocols is characterized in [18]. The same 
result is obtained using lower-complexity codes in [41] 
and [42]. For arbitrary ss-ss networks, upper bounds on 
the diversity order are derived in [53]; 
however, no achievability results are given there. Very 
recently, [30] have characterized the diversity of general 
ss-ss networks. It must be noted that this result can 
be obtained as a special case of our result for multi- 
terminal networks, which appeared in [21], although the 
achievability strategy is different in [30]. 

4) DMT of single-antenna full-duplex networks: As a 
consequence of the compound channel results in [33], the 
optimal DMT of full-duplex single-antenna networks 
can be proved to be equal to the cut-set bound. While 
most of the results in the current paper focus on either 
multi-antenna or half-duplex networks, it must be noted 
that the schemes presented in [33] involve long random 
coding arguments in contrast to the short block-length, 
explicit schemes presented in the present paper. 

5) Maximum multiplexing gain of cooperative relay networks: 
The maximum multiplexing gain for single-antenna 
full-duplex relay networks can be readily obtained 
from the results in [33], and it is potentially 
possible to extend these results to the multiple-antenna 
case. We adopt, however, a different approach here, 
and utilize a conversion from the deterministic wireless 
network to the fading network in order to determine the 
MMG. The conversion is operational in the sense that 
a capacity-achieving strategy on the deterministic network 
can be converted into an MMG-achieving strategy for the 
fading network. 

6) The DMT of the parallel channel in closed form is 
obtained in Lemma 2.5. A special case of this result is 
derived in [18], where the authors characterize the 
parallel channel DMT for the case when all the individual 
channels have the same DMT. 

E. Outline 

In Section II, we present basic results and techniques which 
will be of use in studying the DMT of multi-hop networks. 
In this section, we introduce the information-flow diagram 
(i-f diagram), and prove a lower bound on the DMT of lower 
triangular matrices. In Section III, we characterize the extreme 
points of the optimal DMT of arbitrary ss-ss networks. We 
provide a lower bound to the DMT of arbitrary ss-ss networks 
with single-antenna, full-duplex relays in Section IV. 

In the sequel to the present paper, we will make use of the 
basic results and techniques introduced here, to characterize 
the optimal DMT of certain classes of networks. The second 
part will also provide code designs for all the protocols 
proposed in both parts of the paper. 

II. Basic Results for Cooperative Networks 

We begin by reviewing the notion of DMT in point-to-point 
channels and then go on to explain how the DMT becomes a 
meaningful tool in the study of cooperative wireless networks. 
Later in this section, we develop general techniques, which 
will prove useful in deriving results on the optimal DMT of 
ss-ss networks. 



A. Background 

1) Diversity-Multiplexing Gain Tradeoff: Let $R$ denote the 
rate of communication across the network in bits per network 
use. Let $\wp$ denote the protocol used across the network, not 
necessarily an AF protocol. Let $r$ denote the multiplexing gain 
associated to rate $R$, defined by 

$$R = r \log(\rho),$$

where $\rho$ denotes the SNR. 

The probability of outage for the network operating under 
protocol $\wp$, i.e., the probability of outage of the induced 
channel in (2), is then given by 

$$P_{\text{out}}(\wp, R) = \inf_{\Sigma_x \geq 0,\ \mathrm{Tr}(\Sigma_x) \leq n\rho} \Pr\left( I(x; y) \leq nR \mid \mathbf{H}(\wp) = H(\wp) \right),$$

where $\mathbf{H}(\wp)$ denotes the collection of all random variables 
associated with the induced channel of the protocol $\wp$. Let the 
outage exponent $d_{\text{out}}(\wp, r)$ be defined by 

$$d_{\text{out}}(\wp, r) = -\lim_{\rho \to \infty} \frac{\log P_{\text{out}}(\wp, R)}{\log(\rho)},$$

and we will indicate this by writing 

$$P_{\text{out}}(\wp, R) \doteq \rho^{-d_{\text{out}}(\wp, r)}.$$

The symbols $\dot{\geq}$, $\dot{\leq}$ are similarly defined. 
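
As a sanity check on these definitions, consider the simplest case of a single Rayleigh link $y = hx + w$ with $h \sim \mathcal{CN}(0,1)$, which is an illustrative special case rather than one of the networks studied here. Outage occurs when $\log(1 + \rho|h|^2) < r\log(\rho)$, and since $|h|^2$ is exponentially distributed with unit mean, the outage probability has a closed form. The sketch below (function names are hypothetical) evaluates the finite-SNR ratio $-\log P_{\text{out}} / \log \rho$, which should approach $1 - r$ as $\rho$ grows.

```python
import math


def outage_probability(rho, r):
    """Pr{log(1 + rho*|h|^2) < r*log(rho)} for h ~ CN(0, 1).

    |h|^2 is exponential with unit mean, so the probability equals
    1 - exp(-(rho**r - 1) / rho); expm1 keeps precision when it is tiny.
    """
    threshold = (rho ** r - 1.0) / rho
    return -math.expm1(-threshold)


def outage_exponent_estimate(rho, r):
    """Finite-SNR estimate of -log(P_out) / log(rho); requires r > 0."""
    return -math.log(outage_probability(rho, r)) / math.log(rho)


if __name__ == "__main__":
    for r in (0.25, 0.5, 0.75):
        print(r, outage_exponent_estimate(1e12, r))
```

The estimate is only meaningful for $r > 0$ at finite SNR, because at $r = 0$ the threshold $\rho^r - 1$ vanishes and the outage probability is exactly zero.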

The outage exponent $d_{\text{out}}(r)$ of the network associated to 
multiplexing gain $r$ is then defined as the supremum of the 
outage exponents taken over all possible protocols, i.e., 

$$d_{\text{out}}(r) = \sup_{\wp} d_{\text{out}}(\wp, r).$$


A distributed space-time code (more simply, a code) operating 
under a protocol $\wp$ is said to achieve a diversity gain 
$d(\wp, r)$ if 

$$P_e(\wp, \rho) \doteq \rho^{-d(\wp, r)},$$

where $P_e(\wp, \rho)$ is the average error probability of the code $\mathcal{C}(\rho)$ 
under maximum likelihood decoding. Using Fano's inequality, 
it can be shown (see [4]) that for a given protocol, 

$$d(\wp, r) \leq d_{\text{out}}(\wp, r).$$

The DMT $d(r)$ of the network associated to a multiplexing 
gain $r$ is then defined as the supremum of all achievable 
diversity gains across all possible protocols and codes. 

We will refer to the outage exponent $d_{\text{out}}(r)$ of a protocol 
in this paper as the DMT $d(r)$ of the protocol, since for 
every protocol discussed in this paper, we shall identify a 
corresponding coding strategy that achieves $d(\wp, r)$ in the 
sequel [1] to the present paper. 

Definition 1: Given a random matrix $H$ of size $m \times n$, we 
define the DMT of the matrix $H$ as the DMT of the associated 
channel $y = Hx + w$, where $x$ and $y$ are column vectors of 
size $(n \times 1)$ and $(m \times 1)$ respectively, and where $w$ is a 
$\mathcal{CN}(0, I)$ column vector. We denote the DMT of the matrix 
$H$ by $d_H(\cdot)$. 

2) Cut-Set bound on DMT: On any network, the cut-set 
upper-bound on mutual information of a general multi-terminal 
network [49] translates into an upper bound on the DMT. This 
was formalized in [9] as follows: 

Lemma 2.1: Let $r \log(\rho)$ be the rate of communication 
between the source and the sink. Given a cut $\omega$ between source 
and destination, let $H_\omega$ denote the transfer matrix between 
nodes on the source-side of the cut and those on the sink-side, 
and let $d_\omega(r)$ be the DMT of $H_\omega$. Then the DMT 
of communication between source and destination is upper 
bounded by 

$$d(r) \leq \min_{\omega \in \Lambda} \{d_\omega(r)\},$$

where $\Lambda$ is the set of all cuts between the source and the 
destination. 

An example of the dominating min-cut is shown in Fig. 2. 

Fig. 2. Cuts in a network. Here, the min-cut is $\Omega_1$. 
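
To make the cut-set bound concrete, the following sketch evaluates it under a simplifying assumption that is not made in Lemma 2.1 itself: every cut matrix $H_\omega$ is taken to be an i.i.d. Rayleigh matrix, so that its DMT is the piecewise-linear Zheng-Tse curve of [4] through the points $(k, (m-k)(n-k))$. The function names and the example list of cuts are hypothetical. In a general network the cut matrices are structured, and their DMT must be computed separately.

```python
import math


def rayleigh_mimo_dmt(m, n, r):
    """Zheng-Tse DMT of an m x n i.i.d. Rayleigh channel at multiplexing gain r.

    Piecewise-linear curve through the points (k, (m - k) * (n - k)),
    k = 0, 1, ..., min(m, n); zero for r >= min(m, n).
    """
    if r < 0:
        raise ValueError("multiplexing gain must be non-negative")
    if r >= min(m, n):
        return 0.0
    k = math.floor(r)
    d_lo = (m - k) * (n - k)
    d_hi = (m - k - 1) * (n - k - 1)
    return d_lo + (r - k) * (d_hi - d_lo)


def cut_set_dmt_bound(cut_dims, r):
    """Minimum over cuts of the per-cut DMT; cut_dims lists (rx, tx) antenna counts."""
    return min(rayleigh_mimo_dmt(m, n, r) for (m, n) in cut_dims)


# Hypothetical single-antenna relay channel with a direct link:
# cut {s} | {r, d} is a 2 x 1 channel, cut {s, r} | {d} is a 1 x 2 channel.
SINGLE_RELAY_CUTS = [(2, 1), (1, 2)]
```

For this example both cuts give $2(1 - r)^{+}$, which is the transmit-diversity bound referred to in the discussion of prior work.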

3) Amplify and Forward Protocols: By an AF protocol $\wp$, 
we will mean a protocol $\wp$ in which each node in the network 
operates in an amplify-and-forward fashion. Such protocols 
induce a linear channel model between source and sink of the 
form 

$$y = H(\wp)x + w, \tag{2}$$

where $y \in \mathbb{C}^m$ denotes the signal received at the sink, $w$ is the 
noise vector, $H(\wp)$ is the $(m \times n)$ induced channel matrix and 
$x \in \mathbb{C}^n$ is the vector transmitted by the source. We impose 
the following energy constraint on the vector $x$ transmitted by 
the source, 

$$\mathrm{Tr}(\Sigma_x) = \mathrm{Tr}(E\{xx^\dagger\}) \leq n\rho,$$

where Tr denotes the trace operator. We will assume a symmetric 
energy constraint at the relays as well as the source. 
Assuming the noise power spectral density to be equal to 1, $\rho$ 
corresponds to the SNR for any individual link. We consider 
both half and full-duplex operation at the relay nodes. 

Our attention here will be restricted to amplify-and-forward 
(AF) protocols since as we shall see, this class of protocols 
can often achieve the DMT of a network. More specifically, 
our protocol will require the links in the network to operate 
according to a schedule which determines the time slots during 
which a node listens as well as the time slots during which it 
transmits. When we say that a node listens, we will mean that 
the node stores the corresponding received signal in its buffer. 
When a node does transmit, the transmitted signal is simply a 
scaled version of the most recent received signal contained in 
its buffer, with the scaling constant chosen to meet a transmit 
power constraint. In particular, nodes in the network are not 
required to decode and then re-encode. It turns out [6] that 
the value of the scaling constant does not affect the DMT of 
the network operating under the specific AF protocol. Without 
loss of accuracy therefore, we will assume that this constant is 
equal to 1. It follows that, for any given network, we only need 
specify the schedule to completely specify the protocol. This 
will create a virtual MIMO channel of the form y = Hx + w 
where H is the effective transfer matrix and w is the noise 
vector, which is in general colored. 

In the following subsections of this section, we will develop 
techniques to handle colored noise as well as establish results 
on the DMT of some elementary network connections. We will 
also establish lower bounds on the DMT of lower triangular 
matrices, which will be useful later in computing the DMT 
of certain protocols. We will also establish the maximum 
multiplexing gain for channel matrices possessing certain 
structure. 

B. White in the Scale of Interest 

In this section, we provide two results that will be extensively 
used in all future sections: Theorem 2.3, which states 
that noise, even though correlated, can be treated as white in 
the scale of interest, and Lemma 2.4, which proves that i.i.d. 
Gaussian inputs are sufficient to attain the outage exponent of 
any channel of the form y = Hx + w. 

If $h$ is a Rayleigh random variable, then it is very easy to 
see that, for any given $\epsilon$ and $\rho$, 

$$\Pr\{|h|^2 > \rho^{\epsilon}\} \leq \exp(-\rho^{\epsilon}).$$

Interestingly, a similar statement holds even when we replace 
$h$ by a polynomial in several Rayleigh random variables. 

Lemma 2.2: Let $\{h_1, h_2, \ldots, h_M\}$ be a collection of i.i.d. 
Rayleigh random variables. Let $f \in \mathbb{C}[X_1, X_2, \ldots, X_M]$ be a 
polynomial in the variables $X_i$ without a constant term. Then 
there exist $A > 0$, $B > 0$, $d > 0$, $\delta > 0$ such that 

$$\Pr\{|f(h_1, h_2, \ldots, h_M)|^2 > k\} \leq A \exp(-B k^{d}), \quad \forall k > \delta,$$

where the constants $A, B, d, \delta$ are independent of $k$. 

Proof: See Appendix I. ■ 

³More sophisticated linear processing techniques would include matrix 
transformations of the incoming signal, but turn out to be not needed here. 

We are now ready to establish that if the noise covariance 
matrix has a certain structure, then it can be considered as 
white noise for the purpose of DMT computation. 

Theorem 2.3: Consider a channel of the form $y = Hx + z$. 
Let $h_1, h_2, \ldots, h_L$ be $L$ i.i.d. Rayleigh random variables. Let 
$G_i$, $i = 1, 2, \ldots, M$, be matrices in which each entry is 
a polynomial function of the random variables $h_1, h_2, \ldots, h_L$. 
Let $z = z_0 + \sum_{i=1}^{M} G_i z_i$ be the noise vector of this channel, 
where $\{z_i\}$ are independent $\mathcal{CN}(0, I)$ 
random vectors. Let the random matrix $H$ be a function of the 
random variables $h_i$. Then the noise vector $z$ is white in the 
scale of interest, i.e., the DMT of the channel $y = Hx + z$ 
is the same as the DMT of the channel $y = Hx + w$ with $w$ 
being a $\mathcal{CN}(0, I)$ random vector. 

Proof: See Appendix II. ■ 
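
To see how noise of the form $z = z_0 + \sum_i G_i z_i$ arises, consider a hypothetical full-duplex chain of single-antenna AF relays with unit scaling, where every receiver adds independent $\mathcal{CN}(0,1)$ noise. This is an illustrative special case with relay processing delays ignored, and the function name is an assumption. Noise injected at an intermediate relay reaches the sink multiplied by the product of the downstream fades, which is a polynomial (here a monomial) in the fade coefficients, as Theorem 2.3 requires.

```python
def af_chain_channel(gains):
    """Effective channel and noise variance of a single-antenna AF chain.

    gains[j] is the fade on hop j (source -> relay 1 -> ... -> sink). Each
    relay forwards its received symbol with unit scaling, and every receiver
    adds independent CN(0, 1) noise.
    """
    h_eff = 1 + 0j
    noise_var = 0.0
    for g in gains:
        # Signal and accumulated noise are both multiplied by this hop's fade,
        # then the receiving node adds fresh unit-variance noise.
        h_eff *= g
        noise_var = abs(g) ** 2 * noise_var + 1.0
    return h_eff, noise_var
```

For two hops with fades $h$ and $g$, this gives the effective channel $gh$ and noise $z_0 + g z_1$ with variance $1 + |g|^2$, so the noise power depends on the channel realization even though each $z_i$ is white.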

Lemma 2.4: [4] For any channel that is of the form $y = Hx + w$ 
with $w$ being white Gaussian noise, i.i.d. Gaussian 
inputs are sufficient to attain the best possible outage exponent 
of the channel. 

Proof: While a complete proof is available in [4], we provide 
a sketch of the same proof for the sake of completeness. 
The outage probability is given by 

$$P_{\text{out}}(R) = \inf_{\Sigma_x:\ \mathrm{Tr}(\Sigma_x) \leq P} \Pr\{ I(x; y \mid \mathbf{H} = H) < R \}.$$

The mutual information is a function of the channel realization 
and the distribution of the input. Nevertheless, without 
loss of optimality, the distribution can be chosen to be Gaussian, 
leading to 

$$P_{\text{out}}(R) = \inf_{\Sigma_x:\ \mathrm{Tr}(\Sigma_x) \leq P} \Pr\{ \log\det(I + \rho H \Sigma_x H^\dagger) < R \}.$$

By bounding the eigenvalues of $\Sigma_x$, the outage probability can 
be bounded below and above as 

$$\Pr\left\{ \log\det\left(I + \frac{\rho}{m} HH^\dagger\right) < R \right\} \geq P_{\text{out}}(R) \geq \Pr\{ \log\det(I + \rho HH^\dagger) < R \}.$$

As $\rho \to \infty$, it can be shown that the two bounds converge, so 
that we get (see Equation (9) in [4]) 

$$P_{\text{out}}(R) \doteq \Pr(\log\det(I + \rho HH^\dagger) < R). \tag{3}$$

■ 

The noise that we deal with in this paper will always satisfy 
the conditions in Theorem 2.3. Hence we will make the two 
assumptions appearing below throughout the paper: 

• the transmitted signal has an i.i.d. Gaussian distribution 
• the noise is white in the scale of interest. 

C. DMT of Elementary Network Connections

Fig. 3. The parallel channel with M sub-channels.

1) Parallel Network: The lemma below presents an expression for the DMT of a parallel channel in terms of the DMT of the individual links.

Lemma 2.5: Consider a parallel channel with $M$ links, with the $i$th link having representation $\mathbf{y}_i = \mathbf{H}_i \mathbf{x}_i + \mathbf{w}_i$, and let $d_i(\cdot)$ denote the corresponding DMT. Then the DMT of the overall parallel channel is given by

$$d(r) = \inf_{(r_1, r_2, \ldots, r_M):\ \sum_{i=1}^{M} r_i = r}\ \sum_{i=1}^{M} d_i(r_i). \tag{4}$$

Proof: See the appendix. ■

The following upper and lower bounds on the outage exponent are immediate from (4):

$$d(r) \le \sum_{i=1}^{M} d_i\!\left(\frac{r}{M}\right), \tag{5}$$

$$d(r) \ge \sum_{i=1}^{M} d_i(r). \tag{6}$$

To determine the DMT of the parallel channel when all component channels are identically distributed with a DMT that is a convex function of the rate, we will make use of the following lemma from the theory of majorization [48]:

Lemma 2.6: [48] If $f(\cdot)$ is a symmetric function in the variables $r_1, r_2, \ldots, r_N$ and is convex in each of the variables $r_i$, $i = 1, 2, \ldots, N$, then

$$\inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} r_i = r} f(r_1, r_2, \ldots, r_N) = f\!\left(\frac{r}{N}, \frac{r}{N}, \ldots, \frac{r}{N}\right). \tag{7}$$

The corollary below follows as a result.

Corollary 2.7: The DMT of a parallel channel with all the individual channels being identical and having a convex DMT is given by

$$d(r) = M\, d_1\!\left(\frac{r}{M}\right). \tag{8}$$

The result in Corollary 2.7 was also obtained in [17].

2) Parallel Channel with Repeated Coefficients:

Lemma 2.8: Consider a parallel channel with $M$ links and repeated channel matrices. More precisely, let there be $N$ distinct channel matrices $\mathbf{H}^{(1)}, \mathbf{H}^{(2)}, \ldots, \mathbf{H}^{(N)}$, with $\mathbf{H}^{(i)}$ repeating in $n_i$ sub-channels, such that $\sum_{i=1}^{N} n_i = M$. Then the DMT of such a parallel channel is given by

$$d(r) = \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} n_i r_i = r}\ \sum_{i=1}^{N} d_i(r_i). \tag{9}$$


Proof: The proof is along the lines of the proof of Lemma 2.5 and is given in the appendix. ■
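The infimum over rate splits in the two parallel-channel lemmas can be approximated numerically. The following is a minimal sketch, not the authors' method: it performs a brute-force grid search over non-negative splits, with `counts[i]` playing the role of the repetition count of the $i$th distinct link (all ones for the case without repetition). The helper names and the grid resolution are assumptions; `mimo_dmt` uses the standard piecewise-linear DMT of an i.i.d. Rayleigh MIMO channel.

```python
def plus(x):
    """(x)^+ = max(x, 0)."""
    return max(x, 0.0)

def mimo_dmt(m, n):
    """Piecewise-linear DMT through the points (k, (m-k)(n-k))."""
    def d(r):
        if r >= min(m, n):
            return 0.0
        k = int(r)
        lo, hi = (m - k) * (n - k), (m - k - 1) * (n - k - 1)
        return lo + (hi - lo) * (r - k)
    return d

def parallel_dmt(dmts, counts, r, steps=100):
    """Approximate inf of sum_i d_i(r_i) subject to sum_i counts[i]*r_i = r, r_i >= 0."""
    def search(i, remaining):
        if i == len(dmts) - 1:
            # The last link absorbs whatever rate is left.
            return dmts[i](remaining / counts[i])
        best = float("inf")
        for k in range(steps + 1):
            share = remaining * k / steps  # part of the total rate carried by link i
            best = min(best, dmts[i](share / counts[i]) + search(i + 1, remaining - share))
        return best
    return search(0, r)

if __name__ == "__main__":
    siso = lambda r: plus(1.0 - r)
    print(parallel_dmt([siso, siso], [1, 1], 0.5))   # two independent scalar links
    print(parallel_dmt([siso, siso], [2, 1], 1.0))   # first coefficient repeated twice
```

For identical links with a convex DMT, the search result can be compared against $M d_1(r/M)$ from the corollary on identical links. The search cost grows exponentially with the number of distinct links, so this is only practical for small examples.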

D. Maximum Multiplexing gain 

In this section, we derive the maximum multiplexing gain 
(MMG) of a MIMO channel matrix with each entry of 
the matrix being a polynomial function of certain Rayleigh 
random variables. We begin by deriving certain properties of 
polynomial functions of gaussian random variables and we 
will later use these characteristics to obtain the MMG. 

Lemma 2.9: Let $p \in \mathbb{R}[X]$ be any non-constant polynomial, and let its degree be $d$. Consider the set $\mathcal{R}$ of all $x \in \mathbb{R}$ over which the following two conditions are satisfied:

$$|p(x)| < k, \tag{10}$$

$$|p'(x)| > m. \tag{11}$$

This subset $\mathcal{R}$ of $\mathbb{R}$ can be expressed as the union

$$\mathcal{R} = \bigcup_{i=1}^{L} R_i \tag{12}$$

of disjoint intervals $R_i = [a_i, b_i]$. Furthermore, $L \le 2d$.

Proof: See Appendix Hill ■ 

Lemma 2.10: Let $\{x_1, x_2, \ldots, x_N\}$ be a collection of independent Gaussian random variables. Let $f \in \mathbb{R}[X_1, X_2, \ldots, X_N]$ be a polynomial in the variables $X_i$. Then there exist constants $A > 0$, $d > 0$, $K > 0$ such that

$$\Pr\{ |f(x_1, x_2, \ldots, x_N)| < \delta \} \le A\, \delta^{1/d}, \quad \forall\ 0 < \delta < K,$$

where the constants $A$, $d$, $K$ depend only on $f$ and not on $\delta$.

Proof: See Appendix II VI ■ 
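The $\delta^{1/d}$ scaling in Lemma 2.10 can be seen in closed form for the simplest case. The sketch below assumes the single-variable polynomial $f(x) = x^{d}$ with $x \sim \mathcal{N}(0,1)$, for which $\Pr\{|f(x)| < \delta\} = \mathrm{erf}(\delta^{1/d}/\sqrt{2})$; this particular polynomial and the function names are illustrative choices, not taken from the paper.

```python
import math

def small_value_probability(delta: float, degree: int) -> float:
    """Pr{ |x**degree| < delta } for x ~ N(0, 1), in closed form."""
    return math.erf(delta ** (1.0 / degree) / math.sqrt(2.0))

def decay_exponent(degree: int, delta1: float = 1e-6, delta2: float = 1e-9) -> float:
    """Slope of log Pr versus log delta between two small thresholds."""
    p1 = small_value_probability(delta1, degree)
    p2 = small_value_probability(delta2, degree)
    return (math.log(p1) - math.log(p2)) / (math.log(delta1) - math.log(delta2))

if __name__ == "__main__":
    for degree in (1, 2, 3):
        print(degree, decay_exponent(degree))
```

The estimated slope should be close to $1/d$, consistent with a bound of the form $A\,\delta^{1/d}$ for this polynomial.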

We will proceed to utilize this lemma to obtain the MMG. 

Definition 2: Given a random matrix $\mathbf{H}$, which is a function of random variables $h_1, h_2, \ldots, h_N$, we define the structural rank of $\mathbf{H}$ as the maximum rank attained by $\mathbf{H}$, where the maximum is computed over all possible realizations of the $\{h_i\}$. We denote the structural rank of a random matrix $\mathbf{H}$ by $\mathrm{Rank}(\mathbf{H})$.
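A simple way to estimate the structural rank of Definition 2 is to evaluate the matrix at a few random realizations and take the largest rank observed. This is a sketch under stated assumptions: it uses real integer samples and exact rational arithmetic rather than complex Gaussian samples, and the sampled maximum is only a lower bound on the structural rank (it equals it with high probability when the entries are polynomials). The function names are hypothetical.

```python
import random
from fractions import Fraction

def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    found, n_cols = 0, len(rows[0])
    for col in range(n_cols):
        pivot = next((i for i in range(found, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][col] / rows[found][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
    return found

def structural_rank(build, num_vars, trials=5, seed=0):
    """Largest rank of build(h) over a few random integer realizations h."""
    rng = random.Random(seed)
    best = 0
    for _ in range(trials):
        h = [rng.randint(1, 10**6) for _ in range(num_vars)]
        best = max(best, rank(build(h)))
    return best

if __name__ == "__main__":
    generic = lambda h: [[h[0], h[1]], [h[2], h[3]]]
    dependent = lambda h: [[h[0], h[0] * h[1]], [h[2], h[2] * h[1]]]
    print(structural_rank(generic, 4), structural_rank(dependent, 3))
```

In the second example the second column is always $h_2$ times the first, so no realization can reach rank 2 even though all entries are non-zero polynomials.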

Theorem 2.11: Consider a channel of the form $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}$, where $\mathbf{H} \in \mathbb{C}^{n \times n}$ is a random matrix, and $\mathbf{x}$, $\mathbf{y}$, $\mathbf{w}$ are $n$-length column vectors representing the transmitted signal, received signal and the noise vector respectively, with the noise being white in the scale of interest. If the entries of $\mathbf{H}$ are polynomial functions of certain underlying Rayleigh random variables, then the maximum multiplexing gain $D$ of the channel is given by

$$D = \mathrm{Rank}(\mathbf{H}).$$

Proof: We will prove that the MMG of the channel is equal to the structural rank $\mathrm{Rank}(\mathbf{H}) =: m$ of $\mathbf{H}$. Clearly, for any given realization $H$, the MMG is upper-bounded by the rank of $H$, which is at most $m$. Therefore the upper bound of $m$ on the MMG is clear. Next, we will show that an MMG of $m$ is achievable, i.e., for any $\delta > 0$, a multiplexing gain of $(m - \delta)$ yields a non-zero diversity gain.

Consider transmission at a multiplexing gain of $r = m - \delta$. Since $\mathbf{H}$ is of structural rank $m$, there is an $m \times m$ sub-matrix $\mathbf{H}_m$ of structural rank $m$. Then $\mathbf{H}_m\mathbf{H}_m^{\dagger}$ is a principal sub-matrix of $\mathbf{H}\mathbf{H}^{\dagger}$. Using the inclusion principle (Theorem 4.3.15 in [50]) and the fact that only $m$ eigenvalues of $\mathbf{H}$ are non-zero, we obtain that

$$\log\det(I + \rho \mathbf{H}\mathbf{H}^{\dagger}) \ge \log\det(I + \rho \mathbf{H}_m\mathbf{H}_m^{\dagger}).$$

Therefore, we get the outage exponent for rate $r = m - \delta$ as

$$\begin{aligned}
\rho^{-d_{\mathrm{out}}(r)} &\doteq \Pr\{ \log\det(I + \rho \mathbf{H}\mathbf{H}^{\dagger}) < r \log\rho \} \\
&\le \Pr\{ \log\det(I + \rho \mathbf{H}_m\mathbf{H}_m^{\dagger}) < r \log\rho \} \\
\Rightarrow\ \rho^{-d_{\mathrm{out}}(m-\delta)} &\,\dot{\le}\, \Pr\{ \det(I + \rho \mathbf{H}_m\mathbf{H}_m^{\dagger}) < \rho^{(m-\delta)} \} \\
&\le \Pr\{ \det(\rho \mathbf{H}_m\mathbf{H}_m^{\dagger}) < \rho^{(m-\delta)} \} \\
&= \Pr\{ \det(\mathbf{H}_m\mathbf{H}_m^{\dagger}) < \rho^{-\delta} \} \\
&= \Pr\{ |\det(\mathbf{H}_m)|^2 < \rho^{-\delta} \}.
\end{aligned}$$

Let the random matrix $\mathbf{H}$, and thereby its sub-matrix $\mathbf{H}_m$, be a function of the Rayleigh random variables $h_1, h_2, \ldots, h_M$. Let us denote the real and imaginary parts of this collection of Rayleigh random variables by $x_1, x_2, \ldots, x_N$, where $N = 2M$. Now the $x_i$ are i.i.d. Gaussian random variables, i.e., they are distributed as $\mathcal{N}(0, 1)$. Then $|\det(\mathbf{H}_m)|^2$ is a non-zero real polynomial $p(x_1, x_2, \ldots, x_N)$ in the $x_i$. Since $p = |\det(\mathbf{H}_m)|^2$ is non-negative,

$$|p(x_1, x_2, \ldots, x_N)| = p(x_1, x_2, \ldots, x_N).$$

We can now use Lemma 2.10 to obtain that

$$\Pr\{ |\det(\mathbf{H}_m)|^2 < \rho^{-\delta} \} = \Pr\{ |p(x_1, x_2, \ldots, x_N)| < \rho^{-\delta} \} \le A \rho^{-\delta/d}, \tag{13}$$

for some positive constants $A$, $d$, $K$ with $\rho^{-\delta} < K$. Let $\rho_0^{-\delta} = K$. Then we can see that (13) is valid for all $\rho > \rho_0$. This leads to

$$\rho^{-d_{\mathrm{out}}(m-\delta)} \,\dot{\le}\, A \rho^{-\delta/d}, \quad \forall \rho > \rho_0 \quad\Rightarrow\quad d_{\mathrm{out}}(m - \delta) \ge \delta/d > 0.$$

Thus an MMG of $m$ is achievable and this concludes the proof. ■

E. A Lower Bound on the DMT of Block-Lower-Triangular Matrices

In this section, we give a lower bound on the DMT of "block-lower-triangular" (blt) matrices, which are defined below. In many situations, the matrices induced by AF protocols in an ss-ss network will turn out to possess block-lower-triangular structure.

Definition 3: Consider a set of $N_i \times N_j$ matrices $A_{ij}$, $j = 1, 2, \ldots, N$, $i \ge j$. Let $A$ be the blt matrix comprised of the block matrices $A_{ij}$ in the $(i, j)$th position and zeros elsewhere, i.e.,

$$A = \begin{bmatrix} A_{11} & & & \\ A_{21} & A_{22} & & \\ \vdots & \vdots & \ddots & \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{bmatrix}.$$

We define the $\ell$-th sub-diagonal matrix $A^{(\ell)}$ of such a blt matrix $A$ as the matrix comprising only the entries $A_{(\ell+1)1}, A_{(\ell+2)2}, \ldots, A_{N(N-\ell)}$, with zeros everywhere else, i.e.,

$$\left[A^{(\ell)}\right]_{ij} = \begin{cases} A_{ij} & \text{if } i - j = \ell, \\ 0_{N_i \times N_j} & \text{otherwise.} \end{cases}$$

The last sub-diagonal matrix of $A$ is defined as the sub-diagonal matrix $A^{(\ell)}$ of $A$, where $\ell$ is the largest integer for which $A^{(\ell)}$ is non-zero. Thus, for example, the matrix whose only nonzero terms are the diagonal entries of $A$ corresponds to $A^{(\ell)}$ with $\ell = 0$, and the matrix whose only nonzero entry is $A_{N1}$ corresponds to $A^{(\ell)}$ with $\ell = (N - 1)$.

The theorem below establishes lower bounds on the DMT of channel matrices which have a blt structure.

Theorem 2.12: Consider a random blt matrix $\mathbf{H}$ having component matrices $\mathbf{H}_{ij}$ of size $N_i \times N_j$. Let $M := \sum_{i=1}^{N} N_i$ be the size of the square matrix $\mathbf{H}$.

Let $\mathbf{H}^{(0)}$ be the diagonal part of the matrix $\mathbf{H}$ and $\mathbf{H}^{(\ell)}$ denote the last sub-diagonal matrix of $\mathbf{H}$, as given by Definition 3. Then,

1) $d_H(r) \ge d_{H^{(0)}}(r)$.

2) $d_H(r) \ge d_{H^{(\ell)}}(r)$.

3) In addition, if the entries of $\mathbf{H}^{(\ell)}$ are independent of the entries in $\mathbf{H}^{(0)}$, then $d_H(r) \ge d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r)$.

Proof: The channel is given by $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}$. Since the noise is white in the scale of interest, by Theorem 2.3 the DMT of this channel is the same as that of a channel with the noise distributed as $\mathcal{CN}(0, I)$. Therefore, without loss of generality, we assume that $\mathbf{w}$ is distributed as $\mathcal{CN}(0, I)$.

For any given matrix $\mathbf{H}$, the outage probability exponent [4] is given by

$$\rho^{-d_H(r)} \doteq \inf_{\Sigma_x:\ \mathrm{Tr}(\Sigma_x) \le MP} \Pr\{ I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) < r \log\rho \}.$$

To estimate this exponent, we begin by identifying lower bounds on the mutual information. Note that by Lemma 2.4, for the purposes of computing the outage exponent, we may assume without loss of optimality that the input $\mathbf{x}$ is distributed as $\mathcal{CN}(0, I)$. We will make this assumption.

Due to the fact that the last sub-diagonal matrix is given by the $\ell$-th sub-diagonal matrix, we have

$$\mathbf{y}_i = H_{ii}\mathbf{x}_i + H_{i(i-1)}\mathbf{x}_{i-1} + \cdots + H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i.$$

Starting with the mutual information term, we have

$$I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) = \sum_{i=1}^{M} I(\mathbf{x}_i;\mathbf{y} \mid \mathbf{H} = H, \mathbf{x}_1^{i-1}). \tag{14}$$

Fig. 4. The i-f diagram for the block lower triangular matrix and its decomposition: (a) the i-f diagram; (b) showing only the direct links; (c) showing only the cross links.



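Consider next the following series of inequalities, for all $i$:

$$\begin{aligned}
I(\mathbf{x}_i;\mathbf{y} \mid \mathbf{H} = H, \mathbf{x}_1^{i-1}) &\ge I(\mathbf{x}_i;\mathbf{y}_i \mid \mathbf{H} = H, \mathbf{x}_1^{i-1}) \\
&= I(\mathbf{x}_i;\ \mathbf{H}_{ii}\mathbf{x}_i + \mathbf{H}_{i(i-1)}\mathbf{x}_{i-1} + \cdots + \mathbf{H}_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i \mid \mathbf{H} = H, \mathbf{x}_1^{i-1}) \\
&= I(\mathbf{x}_i;\ H_{ii}\mathbf{x}_i + H_{i(i-1)}\mathbf{x}_{i-1} + \cdots + H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i \mid \mathbf{x}_1^{i-1}) \\
&= I(\mathbf{x}_i;\ H_{ii}\mathbf{x}_i + \mathbf{w}_i \mid \mathbf{x}_1^{i-1}) \\
&= I(\mathbf{x}_i;\ H_{ii}\mathbf{x}_i + \mathbf{w}_i).
\end{aligned}$$

The last step follows since the $\{\mathbf{x}_i\}$ are independent under the assumed $\mathcal{CN}(0, I)$ distribution. We thus have

$$I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) \ge \sum_{i=1}^{M} I(\mathbf{x}_i;\ H_{ii}\mathbf{x}_i + \mathbf{w}_i) \ge I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}). \tag{15}$$

In the above, as is customary, whenever a variable with a negative index is encountered, it should be interpreted as if the variable were not present. From (15), it follows that

$$\rho^{-d_H(r)} \doteq \Pr\{ I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) < r\log\rho \} \le \Pr\{ I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}) < r\log\rho \}, \tag{16}$$

i.e.,

$$d_H(r) \ge d_{H^{(0)}}(r). \tag{17}$$

In the "information-flow diagrams" appearing in Fig. 4, the lower bounding of the mutual information by replacing the matrix $\mathbf{H}$ by the diagonal $\mathbf{H}^{(0)}$ can be seen to correspond to a pruning of the graph shown in Fig. 4(a), resulting in the graph in Fig. 4(b).

Similarly, we have a second set of inequalities which correspond effectively to replacing the matrix $\mathbf{H}$ by the last sub-diagonal matrix $\mathbf{H}^{(\ell)}$. This corresponds to the pruned graph appearing in Fig. 4(c).

$$\begin{aligned}
I(\mathbf{x}_{i-\ell};\mathbf{y} \mid \mathbf{H} = H, \mathbf{x}_{i-\ell+1}^{N}) &\ge I(\mathbf{x}_{i-\ell};\mathbf{y}_i \mid \mathbf{H} = H, \mathbf{x}_{i-\ell+1}^{N}) \\
&= I(\mathbf{x}_{i-\ell};\ \mathbf{H}_{ii}\mathbf{x}_i + \mathbf{H}_{i(i-1)}\mathbf{x}_{i-1} + \cdots + \mathbf{H}_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i \mid \mathbf{H} = H, \mathbf{x}_{i-\ell+1}^{N}) \\
&= I(\mathbf{x}_{i-\ell};\ H_{ii}\mathbf{x}_i + H_{i(i-1)}\mathbf{x}_{i-1} + \cdots + H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i \mid \mathbf{x}_{i-\ell+1}^{N}) \\
&= I(\mathbf{x}_{i-\ell};\ H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i \mid \mathbf{x}_{i-\ell+1}^{N}) \\
&= I(\mathbf{x}_{i-\ell};\ H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i),
\end{aligned}$$

so that

$$\begin{aligned}
I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) &= \sum_{i=N}^{1} I(\mathbf{x}_i;\mathbf{y} \mid \mathbf{H} = H, \mathbf{x}_{i+1}^{N}) \\
&\ge \sum_{i=N}^{\ell+1} I(\mathbf{x}_{i-\ell};\mathbf{y} \mid \mathbf{H} = H, \mathbf{x}_{i-\ell+1}^{N}) \\
&\ge \sum_{i=N}^{\ell+1} I(\mathbf{x}_{i-\ell};\ H_{i(i-\ell)}\mathbf{x}_{i-\ell} + \mathbf{w}_i) \\
&= I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}).
\end{aligned} \tag{18}$$

Thus

$$\rho^{-d_H(r)} \doteq \Pr\{ I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) < r\log\rho \} \le \Pr\{ I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}) < r\log\rho \} \tag{19}$$

$$\Rightarrow\ d_H(r) \ge d_{H^{(\ell)}}(r). \tag{20}$$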



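It follows therefore, from (15) and (18), that

$$I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) \ge \max\left( I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}),\ I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}) \right). \tag{21}$$

This leads to

$$\begin{aligned}
\rho^{-d_H(r)} &\doteq \Pr\{ I(\mathbf{x};\mathbf{y} \mid \mathbf{H} = H) < r\log\rho \} \\
&\le \Pr\{ \max\{ I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}),\ I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}) \} < r\log\rho \} \\
&= \Pr\{ I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}) < r\log\rho,\ I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}) < r\log\rho \} \\
&= \Pr\{ I(\mathbf{x};\ \mathbf{H}^{(0)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(0)} = H^{(0)}) < r\log\rho \} \cdot \Pr\{ I(\mathbf{x};\ \mathbf{H}^{(\ell)}\mathbf{x} + \mathbf{w} \mid \mathbf{H}^{(\ell)} = H^{(\ell)}) < r\log\rho \} \\
&\doteq \rho^{-d_{H^{(0)}}(r)}\, \rho^{-d_{H^{(\ell)}}(r)} \\
&= \rho^{-\left(d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r)\right)}
\end{aligned}$$

$$\Rightarrow\ d_H(r) \ge d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r), \tag{22}$$

where the factorization into a product of two probabilities follows from the independence of the entries of $\mathbf{H}^{(0)}$ and $\mathbf{H}^{(\ell)}$, which holds by the hypothesis of part 3). ■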

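Remark 1: The following two matrix inequalities can be deduced from the proof of Theorem 2.12, with $\mathbf{H}^{(0)}$ and $\mathbf{H}^{(\ell)}$ defined as in the theorem:

$$\det(I + \rho \mathbf{H}\mathbf{H}^{\dagger}) \ge \det(I + \rho \mathbf{H}^{(0)}\mathbf{H}^{(0)\dagger}) \tag{23}$$

$$\det(I + \rho \mathbf{H}\mathbf{H}^{\dagger}) \ge \det(I + \rho \mathbf{H}^{(\ell)}\mathbf{H}^{(\ell)\dagger}). \tag{24}$$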
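The two determinant inequalities of Remark 1 are easy to spot-check numerically. The following is a minimal sketch assuming scalar blocks ($N_i = 1$) and a randomly drawn banded lower-triangular matrix; the helper names, the matrix size, and the SNR value are illustrative assumptions, and a numerical check of this kind is of course no substitute for the proof.

```python
import random

def det(a):
    """Determinant of a square complex matrix via Gaussian elimination with pivoting."""
    a = [row[:] for row in a]
    n, result = len(a), 1.0 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if abs(a[p][c]) == 0:
            return 0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= f * a[c][j]
    return result

def capacity_det(h, rho):
    """det(I + rho * H H^dagger); the matrix is Hermitian, so the value is real."""
    n = len(h)
    g = [[(1.0 if i == j else 0.0)
          + rho * sum(h[i][k] * h[j][k].conjugate() for k in range(n))
          for j in range(n)] for i in range(n)]
    return det(g).real

def sub_diagonal(h, ell):
    """Keep only the entries with i - j == ell, zeros elsewhere."""
    n = len(h)
    return [[h[i][j] if i - j == ell else 0j for j in range(n)] for i in range(n)]

def random_band_lower_triangular(n, ell, rng):
    """Lower-triangular matrix whose nonzero entries satisfy 0 <= i - j <= ell."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) if 0 <= i - j <= ell else 0j
             for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    rng = random.Random(1)
    h = random_band_lower_triangular(4, 2, rng)
    print(capacity_det(h, 10.0),
          capacity_det(sub_diagonal(h, 0), 10.0),
          capacity_det(sub_diagonal(h, 2), 10.0))
```

The first printed value should be at least as large as each of the other two, in line with (23) and (24).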

Remark 2: Although the result is derived for lower triangular matrices, it also applies in a slightly more general setting. Consider a band matrix $\mathbf{H}$ whose non-zero entries, denoted by $*$, are confined to a set of consecutive diagonal bands. Let $\mathbf{H}_{ub}$ and $\mathbf{H}_{lb}$ denote the matrices derived from $\mathbf{H}$ that consist of only the uppermost band and the lowermost band respectively, with zeros everywhere else.

Without affecting the DMT, the matrix $\mathbf{H}$ can be transformed to a blt matrix of larger size, by adding an appropriate number of all-zero rows at the top and all-zero columns to the right. Then the uppermost band of $\mathbf{H}$ belongs to the diagonal, and the lowermost band belongs to the last sub-diagonal of the new matrix. By invoking Theorem 2.12 for the new matrix, we get

1) $d_H(r) \ge d_{H_{ub}}(r)$.

2) $d_H(r) \ge d_{H_{lb}}(r)$.

If the entries of $\mathbf{H}_{ub}$ and $\mathbf{H}_{lb}$ are independent of each other, then we further have

$$d_H(r) \ge d_{H_{ub}}(r) + d_{H_{lb}}(r).$$

1 ) Example Applications of the DMT Lower bound: In this 
subsection, we derive lower bounds to the DMT of two-hop 
networks under the operation of various existing AF protocols. 
One lower bound proves a conjecture by Rao and Hassibi [28], 
while a second is tighter than lower bounds known earlier. 
In the remaining instances, although the results do not add 
to what is already known, the derivations presented here are 
surprisingly simple and provide some intuitive explanation as 
to how these protocols achieve the DMT. 

Example 1: Single relay, NAF protocol 

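Consider the relay network in Fig. 1(b). Let $g_d$, $g_1$, $h_1$ denote the channel coefficients along the links from source to the sink, source to the relay and relay to the sink respectively. The induced channel under the NAF protocol is given by

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} g_d & 0 \\ g_1 h_1 & g_d \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 + h_1 w \end{bmatrix}, \tag{25}$$

where $w$ denotes the noise forwarded by the relay.

Since two time instants are used in order to obtain the equivalent channel matrix $\mathbf{H}$, we have a rate loss by a factor of 2, and hence $d(r) = d_H(2r)$. It can be checked that the noise is white in the scale of interest. Now it is sufficient to study the DMT of the matrix $\mathbf{H}$. Let

$$\mathbf{H}^{(0)} = \begin{bmatrix} g_d & 0 \\ 0 & g_d \end{bmatrix} \quad \text{and} \quad \mathbf{H}^{(\ell)} = \begin{bmatrix} 0 & 0 \\ g_1 h_1 & 0 \end{bmatrix}.$$

The fading coefficients $g_d$, $g_1$, $h_1$ are independent and therefore $\mathbf{H}^{(0)}$ is independent of $\mathbf{H}^{(\ell)}$. Invoking Theorem 2.12, we obtain

$$d_H(r) \ge d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r).$$

The diversity gains $d_{H^{(0)}}(r)$ and $d_{H^{(\ell)}}(r)$ are easily evaluated as

$$d_{H^{(0)}}(r) = \left(1 - \frac{r}{2}\right)^{+}, \qquad d_{H^{(\ell)}}(r) = (1 - r)^{+}.$$

This leads to $d_H(r) \ge \left(1 - \frac{r}{2}\right)^{+} + (1 - r)^{+}$, and hence to the following estimate of the DMT $d(r)$ of the protocol:

$$d(r) = d_H(2r) \quad\Rightarrow\quad d(r) \ge (1 - r)^{+} + (1 - 2r)^{+}.$$

From [6] we know that this bound is indeed tight.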
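The arithmetic of Example 1 can be written out directly. The sketch below evaluates the two matrix DMTs at the rescaled rate $2r$ and compares the sum with the closed form $(1-r)^{+} + (1-2r)^{+}$; the function names are hypothetical, and the `num_relays` parameter is included only so that the same helper also evaluates the $N$-relay expression stated in Remark 3.

```python
def plus(x):
    """(x)^+ = max(x, 0)."""
    return max(x, 0.0)

def naf_lower_bound(r, num_relays=1):
    """(1 - r)^+ + N (1 - 2r)^+; N = 1 is the single-relay bound of Example 1."""
    return plus(1.0 - r) + num_relays * plus(1.0 - 2.0 * r)

def naf_bound_from_blt(r):
    """d(r) = d_H(2r) >= d_{H^(0)}(2r) + d_{H^(l)}(2r) for the 2 x 2 NAF matrix."""
    d_diagonal = lambda x: plus(1.0 - x / 2.0)   # g_d repeated twice on the diagonal
    d_sub_diagonal = lambda x: plus(1.0 - x)     # single entry g_1 * h_1
    return d_diagonal(2.0 * r) + d_sub_diagonal(2.0 * r)

if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(r, naf_lower_bound(r), naf_bound_from_blt(r))
```

Both columns of the printed table should agree, since the second is simply the first before simplification.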

Remark 3: For the case of the NAF protocol used with $N$ relays, it can be shown that Theorem 2.12 can be used to obtain a lower bound on the DMT of the NAF protocol as

$$d(r) \ge (1 - r)^{+} + N(1 - 2r)^{+}. \tag{26}$$

This lower bound is proved to be tight for the $N$-relay case as well in [6].

Example 2: Multiple relays, SAF 

Consider the network in Fig. 1(a) with $N$ relays. We employ an $M$-slot AF protocol termed the Slotted Amplify-and-Forward (SAF) protocol, introduced in [19]. We assume that the relays are isolated from each other's transmissions (see [19] for a description). Each symbol transmitted by the source reaches the sink through the direct link, as well as through precisely one relayed path. For this relay-isolated case, the induced channel matrix for an $M$-slot protocol is given by an $M \times M$ matrix $\mathbf{H}$, with $g_d$, the fading coefficient of the direct link, appearing along the diagonal, and with $g_1, \ldots, g_N$, the product coefficients on the different relay paths, appearing in repeated cyclic fashion along the first sub-diagonal. Let $M = kN + 1$ denote the slot length, for a positive integer $k$.

For example, in the $M = 5$, $N = 2$, $k = 2$ case, the induced channel matrix $\mathbf{H}$ is given by

$$\mathbf{H} = \begin{bmatrix} g_d & & & & \\ g_1 & g_d & & & \\ & g_2 & g_d & & \\ & & g_1 & g_d & \\ & & & g_2 & g_d \end{bmatrix}.$$

Since the channel is used for $M$ time slots, we have the relation $d(r) = d_H(Mr)$ between the DMT of the protocol, $d(r)$, and the DMT of the matrix, $d_H(r)$. We next proceed to find a lower bound on the DMT of the matrix. As before, we use $\mathbf{H}^{(0)} = g_d I$ to denote the diagonal matrix associated to $\mathbf{H}$. Similarly, let $\mathbf{H}^{(\ell)}$ denote the last sub-diagonal matrix corresponding to $\mathbf{H}$. This matrix contains $g_1, \ldots, g_N$ repeated $k$ times cyclically along the first sub-diagonal. By Theorem 2.12, the DMT of $\mathbf{H}$ can be lower bounded as

$$d_H(r) \ge d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r) \quad\Rightarrow\quad d(r) \ge d_{H^{(0)}}(Mr) + d_{H^{(\ell)}}(Mr).$$

The DMT of the matrices $\mathbf{H}^{(0)}$ and $\mathbf{H}^{(\ell)}$ can be easily derived as $d_{H^{(0)}}(r) = \left(1 - \frac{r}{M}\right)^{+}$ and $d_{H^{(\ell)}}(r) = N\left(1 - \frac{r}{M-1}\right)^{+}$, leading to

$$d(r) \ge (1 - r)^{+} + N\left(1 - \frac{M}{M-1}\, r\right)^{+}.$$

The right hand side is in fact shown to be equal to the DMT of the SAF protocol in [19] under the assumption that relays are isolated.
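To make the structure of the SAF channel matrix concrete, the sketch below builds its symbolic pattern for the relay-isolated case (direct-link coefficient on the diagonal, relay product coefficients cycling along the first sub-diagonal) and evaluates the lower bound $(1-r)^{+} + N\left(1 - \frac{M}{M-1} r\right)^{+}$. The string labels and function names are illustrative assumptions.

```python
def saf_channel_pattern(num_relays, k):
    """Symbolic M x M SAF channel matrix for isolated relays, with M = k*N + 1."""
    m = k * num_relays + 1
    pattern = [["0"] * m for _ in range(m)]
    for i in range(m):
        pattern[i][i] = "gd"  # direct link on the diagonal
        if i >= 1:
            # Relay product coefficients g1, ..., gN repeat cyclically.
            pattern[i][i - 1] = f"g{(i - 1) % num_relays + 1}"
    return pattern

def saf_lower_bound(r, num_relays, k):
    """(1 - r)^+ + N (1 - M r / (M - 1))^+ with M = k*N + 1."""
    m = k * num_relays + 1
    return max(1.0 - r, 0.0) + num_relays * max(1.0 - m * r / (m - 1), 0.0)

if __name__ == "__main__":
    for row in saf_channel_pattern(num_relays=2, k=2):
        print(" ".join(f"{entry:>2}" for entry in row))
    print(saf_lower_bound(0.5, num_relays=2, k=2))
```

As $k$ (and hence $M$) grows, the factor $M/(M-1)$ approaches 1 and the bound approaches $(N+1)(1-r)^{+}$.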

Example 3: Multiple -Antenna, Single-Relay, NAF protocol 

Consider a single-relay network with the source, the relay and sink equipped with multiple antennas given by $n_s$, $n_r$ and $n_d$ respectively. We follow [17] and assume operation under the NAF protocol introduced in [6] for the single-antenna case. The channel matrix turns out to be given by

$$\mathbf{H} = \begin{bmatrix} \mathbf{H}_d & 0 \\ \mathbf{H}_r & \mathbf{H}_d \end{bmatrix}, \tag{27}$$

where $\mathbf{H}_d$ is the $n_d \times n_s$ fading matrix between source and the sink, and $\mathbf{H}_r$ is the product fading matrix of an $n_r \times n_s$ Rayleigh fading matrix between the source and the relay and an $n_d \times n_r$ Rayleigh fading matrix between relay and sink. Proceeding in the same manner as in Example 1, we get

$$d(r) \ge d_{H_d}(r) + d_{H_r}(2r).$$

This lower bound appears as Theorem 1 in [17].

Example 4: Multiple Antenna, Multiple relays, NAF protocol

We consider an $N$-relay network with each node in the network having multiple antennas. Let $n_s$, $n_i$ and $n_d$ denote the number of antennas with the source, $i$th relay and the destination respectively.

The NAF protocol was proposed in [6] for the case of $N$ relays, with all nodes possessing single antennas. The protocol can be viewed as using the NAF protocol for each relay separately (the protocol comprises two slots, with the source transmitting to the relay and destination in the first slot and the relay and the source transmitting to the destination in the second slot) and then cycling through all the relays. The same protocol was used in the case of multiple antenna relays in [17]. However, it is not clear that this is optimal if the relays have different numbers of antennas. In that case, we might want to use the relay with more antennas more frequently in order to get better performance.

Therefore, in this example, we consider a generalization of the NAF protocol for multiple antenna relays, where we cycle through all the relays for unequal periods of time. Specifically, we use a NAF protocol for relay $i$ for $m_i$ cycles. When we say we use a NAF protocol for relay $i$, it means that the source will first transmit to the relay during the first time instant and then in the second time instant the source and the relay will transmit to the destination. Thus a NAF protocol operated for a single relay for one cycle will take up 2 time instants. Let $M := \sum_{i=1}^{N} m_i$. Then the protocol operates for $2M$ time slots, and the induced channel matrix $\mathbf{H}$ of size $2M \times 2M$ (in blocks) between source and destination will contain the direct link fading matrix $\mathbf{H}_d$ repeated along the diagonal, while the first sub-diagonal will have the product matrix $\mathbf{H}_i$ corresponding to relay $i$ repeated $m_i$ times.

More precisely, the relay matrix $\mathbf{H}_i$ is the product of the Rayleigh fading matrix between the source and the $i$th relay and the Rayleigh fading matrix $\mathbf{G}_i$ between the $i$th relay and the destination. Then the DMT $d_i(r)$ of the product matrix can be computed using the Rayleigh product channel DMT in [18].

This induces an effective channel matrix between the source and the destination, which will be of the form:

$$\mathbf{H} = \begin{bmatrix} \mathbf{H}_1 & 0 \\ \mathbf{H}_b & \mathbf{H}_1 \end{bmatrix}, \tag{28}$$

where $\mathbf{H}_1 = I_M \otimes \mathbf{H}_d$, with $I_M$ denoting the identity matrix of size $M$, $\otimes$ denoting the tensor product and $\mathbf{H}_d$ denoting the $n_d \times n_s$ fading matrix corresponding to the direct link between the source and the destination. $\mathbf{H}_b$ is now a block diagonal matrix with the matrix $\mathbf{H}_i$ appearing along the diagonal $m_i$ times. Now the DMT of the protocol is given by

$$d(r) = d_H(2Mr) \ge d_{H^{(0)}}(2Mr) + d_{H^{(\ell)}}(2Mr)$$

by Theorem 2.12. Since $\mathbf{H}^{(0)}$ contains the diagonal element $\mathbf{H}_d$ repeated $2M$ times, $d_{H^{(0)}}(2Mr) = d_{H_d}(r)$. Also $d_{H^{(\ell)}}(2Mr) = d_{H_b}(2Mr)$.

The DMT of the matrix $\mathbf{H}_b$ can be computed using the DMT of a parallel channel with repeated coefficients (Lemma 2.8). This gives us

$$d_{H_b}(Mr) = \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} f_i r_i = r}\ \sum_{i=1}^{N} d_i(r_i),$$

where $f_i = \frac{m_i}{M}$.

Thus a lower bound on $d(r)$ can be computed as

$$d(r) \ge d_{H_d}(r) + \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} f_i r_i = 2r}\ \sum_{i=1}^{N} d_i(r_i).$$

Now since the activation durations $\{f_i\}$ for the relays are arbitrary, we can optimize the DMT over all possible $\{f_i\}$ such that $\sum_{i=1}^{N} f_i = 1$.

Thus we get

$$d(r) \ge d_{H_d}(r) + \sup_{(f_1, f_2, \ldots, f_N):\ \sum_{i=1}^{N} f_i = 1}\ \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} f_i r_i = 2r}\ \sum_{i=1}^{N} d_i(r_i). \tag{29}$$

The scheme in [17] is now a special case of this protocol where all the relays are used for an equal duration of time, i.e., $f_i = 1/N$ for all $i$. After substituting $\theta_i := \frac{r_i}{2Nr}$, we get

$$d(r) \ge d_{H_d}(r) + \inf_{(\theta_1, \theta_2, \ldots, \theta_N):\ \sum_{i=1}^{N} \theta_i = 1}\ \sum_{i=1}^{N} d_i(2N\theta_i r),$$

which is indeed the formula in Theorem 2 of [17]. However, the lower bound on the DMT that we have in (29) is better than the lower bound in Theorem 2 of [17], since we allow for arbitrary periods of activation, which is a more general approach.
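The bound of Example 4 can be explored numerically for two relays. The sketch below is an illustration under assumptions, not the authors' procedure: for a fixed activation fraction `f1` of the first relay (the second gets `1 - f1`) it approximates the infimum over rate pairs with $f_1 r_1 + f_2 r_2 = 2r$ by a grid search, and it then scans a coarse grid of fractions in place of the supremum. The per-link DMTs are passed in as functions; the names and grid sizes are hypothetical.

```python
def plus(x):
    """(x)^+ = max(x, 0)."""
    return max(x, 0.0)

def relay_term(d1, d2, f1, rate, steps=400):
    """Approximate inf of d1(r1) + d2(r2) over r1, r2 >= 0 with f1*r1 + (1-f1)*r2 = rate."""
    if not 0.0 < f1 < 1.0:
        raise ValueError("f1 must lie strictly between 0 and 1")
    best = float("inf")
    for k in range(steps + 1):
        s = rate * k / steps  # the part f1*r1 of the constraint carried by relay 1
        best = min(best, d1(s / f1) + d2((rate - s) / (1.0 - f1)))
    return best

def generalized_naf_bound(d_direct, d1, d2, f1, r):
    """d_{H_d}(r) plus the relay term evaluated at rate 2r, for a fixed f1."""
    return d_direct(r) + relay_term(d1, d2, f1, 2.0 * r)

def best_activation(d_direct, d1, d2, r, grid=19):
    """Fraction f1 on a coarse grid that maximizes the bound."""
    candidates = [(k + 1) / (grid + 1) for k in range(grid)]
    return max(candidates, key=lambda f1: generalized_naf_bound(d_direct, d1, d2, f1, r))

if __name__ == "__main__":
    siso = lambda x: plus(1.0 - x)
    print(generalized_naf_bound(siso, siso, siso, 0.5, 0.25))
```

With identical single-antenna links and equal activation, the value agrees with $(1-r)^{+} + 2(1-2r)^{+}$, the two-relay NAF expression.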

Example 5: Multiple-Antenna, Multiple-Relay, Generalized 
NAF protocol 

Let us now consider an $N$-relay network with the source and destination having $n_s$ and $n_d$ antennas and the relays having a single antenna each. For this network, the generalized NAF protocol was proposed in [28], where during the first $T$ time instants, the source transmits to the $N$ relays. Over the next $T$ time slots, the relays transmit a linear transformation of the vector received over the prior $T$ time slots. This induces an effective channel matrix between the source and the destination, which will be of the form:

$$\mathbf{H} = \begin{bmatrix} \mathbf{H}_1 & 0 \\ \mathbf{H}_R & \mathbf{H}_1 \end{bmatrix}, \tag{30}$$

where $\mathbf{H}_1 = I_T \otimes \mathbf{H}_d$, with $I_T$ denoting the identity matrix of size $T$, $\otimes$ denoting the tensor product and $\mathbf{H}_d$ denoting the $n_d \times n_s$ fading matrix corresponding to the direct link between the source and the destination. $\mathbf{H}_R$ is a $Tn_d \times Tn_s$ matrix which depends not only on the channel fading coefficients, but also on the linear transformations employed at the relays. It corresponds to the relaying path, and we call it the effective relaying matrix.

Now, $\mathbf{H}$ is blt and therefore we invoke Theorem 2.12 to get $d_H(r) \ge d_{H^{(0)}}(r) + d_{H^{(\ell)}}(r)$. The matrix $\mathbf{H}^{(0)}$ corresponds to a block-diagonal matrix with $\mathbf{H}_1$ repeated twice along the diagonal or, effectively, $\mathbf{H}_d$ repeated $2T$ times along the diagonal, and clearly $\mathbf{H}^{(\ell)} = \mathbf{H}_R$. Therefore

$$d_{H^{(0)}}(r) = d_{H_d}\!\left(\frac{r}{2T}\right).$$

The protocol utilizes $2T$ time instants to induce the effective channel matrix $\mathbf{H}$, and therefore the DMT of the protocol $d(r)$ can be given in terms of the DMT of the matrix $\mathbf{H}$ as $d(r) = d_H(2Tr)$. Thus,

$$\begin{aligned}
d(r) &= d_H(2Tr) \\
&\ge d_{H^{(0)}}(2Tr) + d_{H^{(\ell)}}(2Tr) \\
&= d_{H^{(0)}}(2Tr) + d_{H_R}(2Tr) \\
&= d_{H_d}(r) + d_{H_R}(2Tr).
\end{aligned} \tag{31}$$

We will now present this DMT inequality in the language of [28]. In [28], the DMT of the effective relaying matrix $\mathbf{H}_R$ is computed after compensating for only a rate loss of $T$ time instants; let us call this $d_R(r)$, i.e., $d_R(r) := d_{H_R}(Tr)$. Let us call the DMT of the direct link $d_D(r) := d_{H_d}(r)$. Now (31) can be re-written as

$$d(r) \ge d_D(r) + d_R(2r), \tag{32}$$

which thus proves Conjecture 1 in [28].



III. Characterization of Extreme Points of DMT of Arbitrary Networks

In this section, we move on to considering multi-hop 
networks. We show that the min-cut is equal to the diversity 
for arbitrary multi-terminal networks with multi-antenna nodes 
irrespective of whether the relays operate under the half-duplex 
constraint or not. We also show for ss-ss full-duplex networks 
that the maximum multiplexing gain is equal to the min-cut 
rank. These two results put together characterize the two end- 
points of the DMT of full-duplex ss-ss networks. 

A. Representation of Multi-Antenna Networks

Earlier, we described how a network is represented as a graph. That graph representation does not differentiate between the case with single-antenna nodes and that with multiple-antenna nodes. We make this distinction in a new representation of the network, described below. We will use this representation throughout this section.

Consider an ss-ss wireless network with nodes potentially having multiple antennas. Every terminal in the network is represented by a super-node and every antenna attached to the terminal is represented by a small node associated with the super-node. There are edges drawn between small nodes of distinct super-nodes, representing the communication channel between antennas of different terminals. Thus every edge is associated with a scalar fading coefficient. Since we are dealing with wireless networks, we assume that the broadcast and interference constraints hold. In effect, the vector $\mathbf{y}_i$ received by a super-node $i$ with $m_i$ antennas can be given in terms of the transmitted vectors $\mathbf{x}_j$ by

$$\mathbf{y}_i = \sum_{j \in \mathrm{In}(i)} \mathbf{H}_{ij}\mathbf{x}_j + \mathbf{w}_i,$$

where $\mathbf{y}_i$ and $\mathbf{w}_i$ are $m_i$-length column vectors, $\mathbf{x}_j$ is an $m_j$-length vector and $\mathbf{H}_{ij}$ is an $m_i \times m_j$ transfer matrix between the super-node $i$ and super-node $j$, containing entries with $\mathcal{CN}(0, 1)$ distribution. Every cut $\omega$ in the network is associated with a channel matrix, which we will denote by $\mathbf{H}_{\omega}$. Fig. 6 illustrates this representation for the case of a single source $S$, two relays $R_1$ and $R_2$ and a sink $D$.

Footnote: A similar representation for deterministic networks is used in [31], albeit in a context different from multiple antenna nodes.




Fig. 5. Source and sink have 2 antennas each, relays have 3 each: (a) original network with multiple-antenna nodes; (b) equivalent network with single-antenna nodes.



It must be noted that even wireline networks can be converted into the above model of wireless networks. This can be done by adding as many small nodes in a super-node as the number of edges emanating from or arriving at a node. Then, by setting the coefficients of chosen edges to zero (or equivalently by removing the corresponding edges from the representation), the broadcast and interference constraints can be nullified. Thus the class of wireline networks is naturally embedded in the class of wireless networks in the above representation.

B. Min-cut equals Diversity

Theorem 3.1: Consider a multi-terminal fading network 
with nodes having multiple antennas with edges connecting 
antennas on two different nodes having i.i.d. Rayleigh-fading 
coefficients. The maximum diversity achievable for any flow 
is equal to the min-cut between the source of the flow and 



the corresponding sink. Each flow can achieve its maximum 
diversity simultaneously. 

Proof: We first consider the case where there is only a single source-sink pair. We will handle the case of single and multiple-antenna nodes separately.

Case I: Network with single antenna nodes

Let the source be $S_i$ and sink be $D_j$. Let $\Lambda_{ij}$ denote the set of all cuts between $S_i$ and $D_j$.

From the cut-set upper bound on the DMT (see Lemma 2.1),

$$d(r) \le \min_{\omega \in \Lambda_{ij}} d_{\omega}(r), \qquad d(0) \le \min_{\omega \in \Lambda_{ij}} d_{\omega}(0) =: \min_{\omega \in \Lambda_{ij}} m_{\omega},$$

where $m_{\omega}$ is the number of edges crossing from the source side to the sink side in the cut $\omega$. So now $d(0) \le m$, where $m := \min_{\omega} m_{\omega}$ is the min-cut.

It suffices to prove that a diversity order equal to $m$ is achievable. We know from Menger's theorem in graph theory (see, e.g., [51]) that the number of edges in the min-cut is equal to the maximum number of edge-disjoint paths between the source and the sink. Schedule the network in such a way that each edge in a given edge-disjoint path is activated one by one. The same is repeated for all the edge-disjoint paths. Let the number of edges in the $i$th edge-disjoint path be $n_i$. The $j$th edge in the $i$th edge-disjoint path is denoted by $e_{ij}$ and the associated fading coefficient by $h_{ij}$. Now define $h_i := \prod_{j=1}^{n_i} h_{ij}$, $i = 1, 2, \ldots, m$. So the activation schedule will be as follows: $e_{11}, e_{12}, \ldots, e_{1(n_1)}, e_{21}, \ldots, e_{2(n_2)}, \ldots, e_{m1}, e_{m2}, \ldots, e_{m(n_m)}$, where each edge is activated one at a time. The total number of time slots required for the protocol is $N := \sum_{i=1}^{m} n_i$. This in effect creates a parallel channel between the source $S_i$ and destination $D_j$. The parallel channel contains $m$ links, with the fading coefficient $h_i$ on link $i$. With this protocol in place, the equivalent channel seen by a symbol is

$$\mathbf{H} = \begin{bmatrix} h_1 & & & \\ & h_2 & & \\ & & \ddots & \\ & & & h_m \end{bmatrix}.$$
This is a parallel channel with all the channels being independent of each other and the DMT of the channels being identical. Therefore we can use Corollary 2.7 and obtain the DMT of the parallel channel as

$$d_H(r) = (m - r)^{+}. \tag{33}$$

This DMT can be achieved by using a DMT-optimal parallel channel code.

The protocol utilizes $N$ time instants to induce this effective channel matrix, and therefore the DMT of the protocol can be given in terms of the DMT of the channel matrix as

$$d(r) = d_H(Nr) \tag{34}$$

$$\phantom{d(r)} = (m - Nr)^{+}. \tag{35}$$

Hence the maximum achievable diversity is $m$.

Case II: Network with multiple antenna nodes

In the multiple antenna case, we pass on to the new representation described in Section III-A. We regard any link between a terminal with $n_t$ transmit antennas and a terminal with $n_r$ receive antennas as being composed of $n_t n_r$ links, with one link between each transmit and each receive antenna. Note that it is possible to selectively activate precisely one of the $n_t n_r$ Tx-antenna-Rx-antenna pairs by appropriately transmitting from just one antenna and listening at just one Rx antenna. As is to be expected, in this modified representation, a cut is defined as separating super-nodes into two sets, since super-nodes represent distinct terminals. With this modification, the same strategy as in the single antenna case can then be applied to achieve a diversity equal to the min-cut in the network.




Fig. 6. Illustration with $n_S = n_D = 2$, $n_1 = n_2 = 3$: (a) original network with multiple-antenna nodes; (b) equivalent network with single-antenna nodes.

Fig. 6 illustrates this conversion for the case of a single source $S$, two relays $R_1$ and $R_2$ and a sink $D$. Having converted the multiple antenna network into one with single antenna nodes, Case II follows from Case I. For the example shown in the figure, the min-cut and therefore the diversity is equal to 12.

Thus the proof is complete for the single flow from $S_i$ to $D_j$.

When there are multiple flows in the network, we simply schedule the data of all the flows in a time-division manner. This will entail a rate loss; however, since we are interested only in the diversity, we can still achieve each flow's maximum diversity simultaneously. ■
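The min-cut that appears in Theorem 3.1 can be computed with any max-flow routine, since by the max-flow/min-cut theorem the two values coincide. The following is a minimal sketch using the Edmonds-Karp algorithm, with the capacity of a terminal-to-terminal link set to the number of antenna pairs $n_t n_r$ it contains. The topology used in the example (source to both relays, both relays to sink, no direct or inter-relay link) is an assumption made for illustration, since the figure itself is not reproduced here; the function names are hypothetical.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity map; equals the min-cut value."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges start empty
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

def antenna_capacities(links, antennas):
    """Each directed terminal link (u, v) carries antennas[u] * antennas[v] scalar edges."""
    capacity = {}
    for u, v in links:
        capacity.setdefault(u, {})[v] = antennas[u] * antennas[v]
    return capacity

if __name__ == "__main__":
    antennas = {"S": 2, "R1": 3, "R2": 3, "D": 2}
    links = [("S", "R1"), ("S", "R2"), ("R1", "D"), ("R2", "D")]  # assumed topology
    print(max_flow(antenna_capacities(links, antennas), "S", "D"))
```

Under the assumed topology every cut separating the source from the sink contains two terminal links of six antenna pairs each, which is consistent with the min-cut value of 12 quoted for the figure.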

C. Maximum Multiplexing Gain equals Minimum Rank 

In this section, we determine the maximum multiplexing gain (MMG) for multi-antenna ss-ss networks to be equal to the min-cut rank (which will be formally defined later). For ss-ss networks with single-antenna nodes, the MMG is at most one, because the source has a single antenna and the cut with the source on one side and the rest of the nodes on the other side yields an upper bound of one on the MMG. It is possible to attain the optimal MMG of 1 by activating one path from the source to the destination using either an amplify-and-forward or a decode-and-forward strategy. However, the MMG-optimal strategy becomes unclear when the number of antennas is greater than 1.

We use results from a recent work on deterministic wireless 
networks [32] to arrive at strategies for achieving the maxi- 
mum multiplexing gain of a fading network. The achievability 
strategies for deterministic wireless networks are lifted to fad- 
ing networks using simple algebraic techniques. We begin with 
discussing a new representation for ss-ss networks, potentially 
having multiple antenna nodes, which will be used in this 
section. 

1) Linear Deterministic Wireless Networks: In defining deterministic wireless networks, we follow [31]. Every terminal in the network is represented by a super-node and each node possesses $q$ small nodes associated with the super-node. All operations take place over a fixed finite field $\mathbb{F}_p$. There are edges drawn between small nodes of distinct super-nodes, representing the communication channel between antennas of different terminals. Since we are dealing with deterministic wireless networks, we assume that the broadcast and interference constraints hold. In effect, the vector $\mathbf{y}_i$ received by a super-node $i$ can be given in terms of the transmitted vectors $\mathbf{x}_j$ of various nodes by

$$\mathbf{y}_i = \sum_{j \in \mathrm{In}(i)} G_{ij}\mathbf{x}_j,$$

where $\mathbf{y}_i$ and $\mathbf{x}_j$ are $q$-length column vectors over $\mathbb{F}_p$, and $G_{ij}$ is a $q \times q$ transfer matrix between the super-node $i$ and super-node $j$, taking values in $\mathbb{F}_p$. Every cut $\omega$ in the deterministic network is associated with a channel matrix, which we will denote by $G_{\omega}$.

The network model of linear deterministic networks thus described has close similarities with the representation of ss-ss fading networks described in Section III-A, with the multiple antennas taking the place of small nodes in the case of fading networks. The only differences between the two are that the deterministic network has noise-free links in place of the noisy links of the fading case, and that every edge coefficient is a finite field element in the deterministic network, in place of a complex fading coefficient. In deterministic networks, each node transmits a $q$-tuple over the finite field. The theorem below, from [31], computes the capacity of a ss-ss linear deterministic wireless network.

Footnote: By deterministic network, we will always mean linear deterministic network.

Footnote: We use the term capacity to signify $\epsilon$-error capacity, as is conventional.

Theorem 3.2: [31] Given a linear deterministic ss-ss wireless network over any finite field $\mathbb{F}_p$, $\forall\, \epsilon > 0$, the capacity $C$ of such a relay network is given by

$$ C = \min_{\omega \in \Omega} \mathrm{rank}(G_\omega), $$

where the capacity is specified in terms of the number of finite field symbols per unit time. A strategy utilizing only linear transformations over $\mathbb{F}_p$ at the relays is sufficient to achieve this capacity.
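The min-cut rank expression of Theorem 3.2 can be evaluated by brute force on a small example. The sketch below is only an illustration under stated assumptions: the helper names (`rank_mod_p`, `cut_matrix`, `min_cut_rank`), the dictionary layout `G[(i, j)]` for the link from super-node `j` to super-node `i`, and the toy diamond network are all hypothetical and do not come from [31]. It enumerates every source/sink cut and computes the rank of the stacked cut matrix over $\mathbb{F}_p$ by Gaussian elimination.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of an integer matrix (list of rows) over the prime field F_p."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def cut_matrix(G, left, right, q):
    """Stack the q x q blocks G[(i, j)] (link j -> i) for j in left, i in right."""
    zero = [[0] * q for _ in range(q)]
    return [
        [entry for j in left for entry in G.get((i, j), zero)[a]]
        for i in right
        for a in range(q)
    ]

def min_cut_rank(nodes, source, sink, G, q, p):
    """Brute-force minimum over all source/sink cuts of rank(G_omega) over F_p."""
    relays = [v for v in nodes if v not in (source, sink)]
    ranks = []
    for k in range(len(relays) + 1):
        for subset in combinations(relays, k):
            left = [source, *subset]
            right = [v for v in nodes if v not in left]
            ranks.append(rank_mod_p(cut_matrix(G, left, right, q), p))
    return min(ranks)

# Hypothetical diamond network S -> {A, B} -> T, q = 2 small nodes, p = 3.
I2 = [[1, 0], [0, 1]]
G = {("A", "S"): I2, ("B", "S"): I2,
     ("T", "A"): [[1, 1], [0, 0]], ("T", "B"): [[0, 0], [0, 1]]}
capacity = min_cut_rank(["S", "A", "B", "T"], "S", "T", G, q=2, p=3)
```

In this toy network each relay-to-sink link has rank one, yet the two links occupy different receive dimensions at the sink, so the cut separating the sink from the rest still has rank two. The enumeration is exponential in the number of relays and is meant only for small illustrations.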

The capacity-achieving strategy in [31] utilizes matrix transformations of the input vector received over a period of $T$ time slots at each relay. This process continues for $L$ blocks, so the total number of time instants required for the scheme is $M = LT$. The achievability shows the existence of relay matrices $A_i$ at each relay node $i \in V$, where $V$ is the set of vertices in the graph. $A_i$ is of size $qT \times qT$, and it represents the transformation from the received vector of size $qT$ to the transmitted vector of size $qT$.

The multi-cast version of Theorem 3.2 is reproduced below:

Theorem 3.3: [31] Given a linear deterministic single-source $D$-sink multi-cast wireless network, $\forall\, \epsilon > 0$, the capacity $C$ of such a network is given by

$$ C = \min_{j = 1,2,\ldots,D}\ \min_{\omega \in \Omega_j} \mathrm{rank}(G_\omega), $$

where $\Omega_j$ is the set of all cuts between the source and destination $j$. A strategy utilizing only linear transformations at the relays is sufficient to achieve this capacity.

2) MMG of ss-ss networks: The main result of this section is given below.

Theorem 3.4: Given a ss-ss multi-antenna wireless network with Rayleigh fading coefficients, the MMG of the network is given by

$$ D = \min_{\omega \in \Omega} \mathrm{Rank}(H_\omega). $$

An amplify-and-forward strategy utilizing only linear transformations at the relays (that do not depend on the channel realization) is sufficient to achieve this MMG.

Proof: (Outline) The proof proceeds as follows: 

1) First, a converse for the MMG is provided using simple 
cut-set bounds. 

2) Then, we convert the fading network into a deterministic 
network with the property that the cut-set bound on 
MMG for the fading network is the same as the cut- 
set bound on the capacity of the deterministic network. 

3) We then characterize the zero-error capacity of the linear 
deterministic wireless network. 

4) Finally, we convert a capacity-achieving scheme for the 
deterministic network into a MMG-achieving scheme for 
the fading network, which matches the converse. 

■ 

The outline of the proof given above is detailed below. A converse on the degrees of freedom of a ss-ss fading network is immediate and is formalized in the following lemma.

Lemma 3.5: Given a ss-ss fading network with i.i.d. Rayleigh fading coefficients, the MMG, $D$, is upper bounded by the MMG of every cut:

$$ D \le \min_{\omega \in \Lambda} \mathrm{Rank}(H_\omega), \qquad (36) $$

where $\Lambda$ denotes the set of all cuts in the network, and $H_\omega$ is the matrix corresponding to the cut $\omega \in \Lambda$.



Next we proceed to the achievability part of the proof. First, we convert the wireless fading network into a derived linear deterministic network. The construction of the derived deterministic network is described below. We will show that the zero-error capacity of this derived deterministic network is lower bounded by the upper bound on the MMG of the fading network.

Let the number of edges in the fading network be $N$. The fading coefficients associated with the edges of the network are denoted by $h_1, h_2, \ldots, h_N$. To construct the derived deterministic network, consider a deterministic network with the same topology as that of the original fading network. We take $q$, the vector length in the deterministic network, to be equal to the maximum number of antennas of any node in the fading network. For nodes with fewer than $q$ antennas, we leave the remaining small nodes unconnected. We still need to decide the finite field size, $p$, and the finite field coefficients on all edges to completely characterize the equivalent finite-field deterministic network. We shall denote these finite-field coefficients by $\xi_i$, $i = 1,2,\ldots,N$. We shall consider $\{\xi_i\}$ as indeterminates before values are assigned to them.

To determine the field size $p$ and $\{\xi_i,\ i = 1,2,\ldots,N\}$, we will impose further conditions. In particular, we will ensure that the deterministic network has at least the same capacity as the upper bound on the MMG of the fading network. Due to the similarity between the expression for capacity in Theorem 3.2 and the MMG terms in Lemma 3.5, the above condition can be met by making sure that, cut by cut, the rank of the transfer matrix $G_\omega$ in the deterministic network is at least as large as the structural rank of the transfer matrix $H_\omega$, i.e., $\mathrm{rank}(G_\omega) \ge \mathrm{Rank}(H_\omega)$.

Let us fix a cut $\omega$, and let $r_\omega := \mathrm{Rank}(H_\omega)$ be the structural rank of the transfer matrix of the cut in the fading network. Then there exists an $r_\omega \times r_\omega$ sub-matrix (say $\tilde{H}_\omega$) of $H_\omega$ which has structural rank $r_\omega$. Consider the same cut on the deterministic network and find the corresponding $r_\omega \times r_\omega$ sub-matrix $\tilde{G}_\omega$ of the transfer matrix $G_\omega$. Now consider the determinant of the matrix $\tilde{G}_\omega$. The determinant is a polynomial in the several variables $\xi_i$, $i = 1,2,\ldots,N$, with rational integer coefficients. Let us call this polynomial $f_\omega(\xi_1, \xi_2, \ldots, \xi_N)$.

This polynomial is not identically zero over $\mathbb{Q}$. If it were, then the substitution $\xi_i = h_i$ would also yield zero irrespective of the choice of $h_i$, making the determinant zero even in the Gaussian case, which contradicts the fact that $\tilde{H}_\omega$ has structural rank $r_\omega$. Therefore, $f_\omega$ is a non-zero polynomial. We also observe that the degree of $f_\omega$ in each of the variables is at most one. The lemma below, easily proved using elementary algebra, shows that it is possible to identify a finite field $\mathbb{F}_p$ and an assignment to $\{\xi_i\}$ of elements from $\mathbb{F}_p$ such that $f_\omega$ does not vanish.

Lemma 3.6: Given a polynomial $f(\xi_1, \xi_2, \ldots, \xi_N)$ with integer coefficients, which is not identically zero, there exists a prime field $\mathbb{F}_p$ with $p$ large enough, such that the polynomial evaluates to a non-zero value for at least one assignment of field values to the formal variables.
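Lemma 3.6 is an existence statement, but on small examples a suitable prime and assignment can simply be searched for. The following is a minimal sketch, assuming the polynomial is supplied as a Python callable with integer coefficients; the function name `find_nonvanishing_assignment`, the short list of candidate primes, and the example polynomial are illustrative choices, not part of the paper.

```python
from itertools import product

def find_nonvanishing_assignment(poly, n_vars, primes=(2, 3, 5, 7, 11, 13)):
    """Return (p, point) with poly(point) != 0 in F_p, or None if none is found."""
    for p in primes:
        for point in product(range(p), repeat=n_vars):
            if poly(*point) % p != 0:
                return p, point
    return None

# 2 * det([[xi1, xi2], [xi3, xi4]]): non-zero over Q, identically zero mod 2.
f = lambda xi1, xi2, xi3, xi4: 2 * (xi1 * xi4 - xi2 * xi3)
result = find_nonvanishing_assignment(f, 4)
```

The example shows why the field size matters: every coefficient of the polynomial is even, so no assignment over $\mathbb{F}_2$ works, while the search succeeds as soon as $p = 3$ is tried.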

Footnote: It must be noted that the conversion to a deterministic network used here is different from that used in [32] and [38].

However, we want to ensure the above condition for every cut in the network. To do so, consider the polynomial

$$ f := \prod_{\omega \in \Omega} f_\omega. \qquad (37) $$

Now, the polynomial $f$ is non-zero since it is a product of non-zero polynomials $f_\omega$, and the degree of $f$ in any of the variables is at most the number of cuts, since each factor has degree at most one in each variable. We want a field $\mathbb{F}_p$ and an assignment for $\{\xi_i\}$ from the field such that $f$ is non-zero. By Lemma 3.6, such an assignment exists. Let us choose that $p$ and the assignment that makes $f$ non-zero. Thus we have a deterministic wireless network whose capacity is guaranteed to be greater than or equal to the MMG upper bound given in (36).

Next, we prove that the zero-error capacity, $C_{ZE}$, of a linear deterministic network is equal to its $\epsilon$-error capacity.

Definition 4: [39] The zero-error capacity of a channel is defined as the supremum of all achievable rates across the channel such that the probability of error is exactly zero.

Theorem 3.7: The zero-error capacity of a ss-ss deterministic wireless network is equal to

$$ C_{ZE} = \min_{\omega \in \Omega} \mathrm{rank}(G_\omega). $$

This capacity can be achieved using a linear code and linear transformations at all relays.

Proof: We will prove this theorem using the $\epsilon$-error capacity result from Theorem 3.2. Let the ss-ss deterministic network be composed of $M$ relay nodes. From the achievability result in the proof of Theorem 3.2, given any $\epsilon > 0$ and rate $r < C$, there exist a block length $T$, a number of blocks $L$, a set of linear transformations $A_j$, $j = 1,2,\ldots,M$, of size $qT \times qT$ used by the relays, and a code book $\mathcal{C}$ for the source, such that the average probability of error, $P_e$, is less than or equal to $\epsilon$. Each codeword $x_i \in \mathcal{C}$ is a $qT \times 1$ vector that specifies the entire transmission from the source. Let $x_1, \ldots, x_{|\mathcal{C}|}$ be the codewords.

Let us assume that the sink listens for $L' \ge L$ blocks in general, to account for the presence of paths of unequal lengths in the network between source and sink (for large $L$, we would have $L'/L \approx 1$, so this does not affect rate calculations). Let $M := LT$ and $M' := L'T$. The transfer equation between the source and destination vectors is given by $y = Gx$, since all transformations in the network are linear. Here $G$ is a $qM' \times qM$ matrix, $x$ is the vector transmitted over $M$ time slots, and $y$ is the vector received over $M'$ time slots.

Given the transmitted vector $x_i$ corresponding to a message $m_i$ at the source, the decoder either always makes an error or never makes an error. This is because the channel is a deterministic linear map, and an error can occur only when two codewords $x_i$ and $x_j$ are mapped to the same vector at the decoder. Let $P_i$ be the probability of error conditioned on the $i$-th codeword, $x_i$, being transmitted. Then $P_i \in \{0,1\}$ according to the argument above, and the average codeword error probability satisfies

$$ P_e = \frac{1}{|\mathcal{C}|} \sum_{i=1}^{|\mathcal{C}|} P_i \le \epsilon. $$

This means that at least $(1-\epsilon)|\mathcal{C}|$ codewords have zero probability of error. Therefore, if we choose only these $(1-\epsilon)|\mathcal{C}|$ codewords as an expurgated code $\mathcal{C}'$, then the code has zero probability of error under the same relay matrices and decoding rule. The rate of the expurgated code is, however, $\tilde{r} = r - \delta$, where $\delta := \frac{1}{M}\log_p \frac{1}{1-\epsilon}$ is the rate loss. The expurgated code therefore has negligible rate loss as $M$ becomes large. We have thus established a zero-error code of rate $r - \delta$. By choosing $r$ arbitrarily close to $C$ and $M$ large, we get $C_{ZE} = C$.

The code $\mathcal{C}$, as given in [31], is non-linear, and so is $\mathcal{C}'$. However, we can obtain a linear code with zero probability of error. Since there exists a zero-error code of rate $\tilde{r}$ with block length $T$ and number of blocks $L$, the transfer matrix $G$ between the source and the sink has rank at least $\tilde{r}M$. Hence $G$ has a sub-matrix $G'$ of size $\tilde{r}M \times \tilde{r}M$ which is of full rank. By activating the appropriate nodes, we can obtain $G'$ as the effective transfer matrix. In that case, a linear code of rate $\tilde{r}$, which communicates only on the corresponding $\tilde{r}M$-dimensional subspace, can be used to achieve zero error. ■
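The expurgation step in the proof of Theorem 3.7 can be made concrete on a toy example. The sketch below is a hedged illustration rather than the construction of [31]: the matrix `G`, the codebook (all vectors of $\mathbb{F}_2^3$), and the function names are hypothetical, and the fraction of discarded codewords here is large only because the example is tiny. It keeps one codeword per distinct channel output, which are exactly the codewords that a fixed decoder never gets wrong.

```python
from itertools import product

def channel(G, x, p):
    """Deterministic linear map y = G x over F_p."""
    return tuple(sum(g * v for g, v in zip(row, x)) % p for row in G)

def expurgate(G, codebook, p):
    """Keep one codeword per distinct channel output.

    The decoder maps each output to the first codeword that produces it, so
    the retained codewords are exactly those that are never decoded in error.
    """
    decoder = {}
    for x in codebook:
        decoder.setdefault(channel(G, x, p), x)
    return list(decoder.values()), decoder

p = 2
G = [[1, 1, 0],
     [0, 0, 1]]                       # rank 2 over F_2
codebook = list(product(range(p), repeat=3))
kept, decoder = expurgate(G, codebook, p)
zero_error = all(decoder[channel(G, x, p)] == x for x in kept)
```

Here four of the eight codewords survive, which matches the rank of `G`: two finite-field symbols per use can be conveyed with zero error, in line with the rank argument used above to obtain a linear zero-error code.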

Thus, for a given fading network, we have constructed an equivalent deterministic network. In the equivalent network, we also have a zero-error achievable rate $\tilde{r}$ using a linear code of block length $T$ and $L$ blocks, with linear strategies at the relays. This achievable rate is related to the MMG of the original fading network as follows:

$$ \tilde{r} + \delta = C_{ZE} = \min_{\omega \in \Omega} \mathrm{rank}(G_\omega) \ge \min_{\omega \in \Omega} \mathrm{Rank}(H_\omega). $$

Further, the positive constant $\delta$ can be made as small as we wish by increasing the block length $T$. Now, when we use the zero-error scheme detailed above, the transfer matrix $G$ of size $qM' \times qM$ between the input and output vectors $x$ and $y$ has rank at least $\tilde{r}M \approx CM$.

Finally, we lift the achievability strategy for zero-error capacity in the equivalent deterministic network to arrive at an achievable strategy for the MMG in the corresponding fading network.

In the reduced deterministic network of a fading network, to achieve the zero-error capacity, the relays perform matrix operations $A_i$ on the vectors received over $T$ time slots. Since each received vector is of size $q$, the matrix $A_i$ is of size $qT \times qT$. Now we use the same strategy for the fading network, i.e., all relays use the same matrices $A_i$ that are obtained via the zero-error strategy in the reduced deterministic network. Though the entries of $A_i$ belong to $\mathbb{F}_p$, they can be treated as integers by identifying the elements of $\mathbb{F}_p$ with the integers $0, 1, \ldots, (p-1)$. Therefore the matrices $A_i$ can also be interpreted as matrices over $\mathbb{C}$. By using the linear maps $A_i$ at the relays in the fading network, we get an induced channel matrix $H$, and the effective channel is of the form $y = Hx + w$. As shown in Theorem 2.11, the MMG offered by this channel is equal to $\mathrm{Rank}(H)$. We shall prove that the MMG offered by this induced channel is greater than or equal to $\tilde{r}M$, i.e., that $\mathrm{Rank}(H) \ge \tilde{r}M$. This is equivalent to showing that there exists an assignment of values to the $h_i$ in the fading network such that $\mathrm{rank}(H) \ge \tilde{r}M$.

In the proof of Theorem 3.7, we restricted the operation of the derived deterministic network to create a transfer matrix $G$ of size $qM' \times qM$ with rank greater than or equal to $\tilde{r}M$. Now we have a similar transfer matrix $H$ in the fading network. If we assign the underlying random variables $h_i$ to be equal to $\xi_i$, again by identifying the elements of $\mathbb{F}_p$ with the integers $0, 1, \ldots, (p-1)$, we have an assignment of $H$ that has rank at least $\tilde{r}M$, because a minor that is non-zero modulo $p$ is also non-zero as an integer. Since the structural rank is the maximum possible rank under any assignment, we get that

$$ \mathrm{Rank}(H) \ge \mathrm{rank}(G) \ge \tilde{r}M. $$
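The lifting step relies on the fact that an integer matrix cannot have smaller rank over the rationals (and hence over $\mathbb{C}$) than its reduction has over $\mathbb{F}_p$. The following sketch checks this on a small matrix; the single `rank` helper, which works over $\mathbb{F}_p$ when `p` is given and over $\mathbb{Q}$ otherwise, and the example matrix are assumptions made for illustration only.

```python
from fractions import Fraction

def rank(rows, p=None):
    """Rank over F_p if a prime p is given, otherwise over the rationals."""
    if p is None:
        m = [[Fraction(x) for x in row] for row in rows]
        inverse = lambda a: 1 / a
        reduce_ = lambda a: a
    else:
        m = [[x % p for x in row] for row in rows]
        inverse = lambda a: pow(a, -1, p)   # modular inverse (Python 3.8+)
        reduce_ = lambda a: a % p
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        scale = inverse(m[r][col])
        m[r] = [reduce_(x * scale) for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [reduce_(a - factor * b) for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Entries are field elements of F_5 identified with the integers 0..4.
G = [[1, 2],
     [3, 1]]
rank_over_f5 = rank(G, p=5)   # det = -5, which vanishes modulo 5
rank_over_q = rank(G)         # det = -5, which is non-zero over Q
```

In this example the inequality is strict: the determinant is a multiple of 5, so the matrix is singular over $\mathbb{F}_5$ but invertible over $\mathbb{Q}$. The reverse situation, a larger rank over $\mathbb{F}_p$ than over $\mathbb{Q}$, cannot occur.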



The induced channel therefore has an MMG of at least $\tilde{r}M$ by Theorem 2.11. Since the network is operated for $M = LT$ time slots in order to obtain an MMG greater than or equal to $\tilde{r}M$, the MMG of the network per time slot is greater than or equal to $\tilde{r}$. By increasing the block length $T$ and the number of blocks $L$, the achievable MMG can be made arbitrarily close to $C_{ZE}$. Thus the upper bound given in Lemma 3.5 is achieved, and hence the MMG of the ss-ss fading network is given by

$$ D = \min_{\omega \in \Lambda} \mathrm{Rank}(H_\omega). $$



D. MMG for Multi-casting 

In this section, we extend the result on MMG to the multi- 
casting scenario. 

Theorem 3.8: Given a single-source $D$-sink multi-cast Gaussian wireless network with Rayleigh fading coefficients, the MMG of the network is given by

$$ D = \min_{j = 1,2,\ldots,D}\ \min_{\omega \in \Omega_j} \mathrm{Rank}(H_\omega). \qquad (38) $$

An amplify-and-forward strategy utilizing only linear transformations at the relays is sufficient to achieve this MMG.

Proof: The proof uses Theorem 3.3, proceeds along the same lines as that of Theorem 3.4, and is omitted here for brevity.



IV. DMT Bounds for Single Antenna Relay 
Networks 

In this section, we consider ss-ss networks equipped with 
full-duplex single-antenna nodes. We provide a lower bound to 
the DMT of such a network by exploiting Menger's theorem. 

Definition 5: Consider a network N and a path P from 
source to sink. This path P is said to have a shortcut if there 
is a single edge in N connecting two non-consecutive nodes 
in P. 

Theorem 4.1: Consider a ss-ss full-duplex network with 
single antenna nodes. Let the min-cut of the network be M. 
Let the network satisfy either of the two conditions below: 

1) The network has no directed cycles, or 

2) There exist a set of M edge-disjoint paths between 
source and sink such that none of the M paths have 
shortcuts. 



Then, a linear DMT $d(r) = M(1-r)^+$ between a maximum multiplexing gain of 1 and a maximum diversity of $M$ is achievable.

Proof: Given that the network has min-cut $M$, there are $M$ edge-disjoint paths from source to sink by Menger's theorem [51]. Let us label the edge-disjoint paths $e_1, e_2, \ldots, e_M$. Let the product of the fading coefficients along path $e_i$ be $g_i$, and let $D_i$ be the delay of path $e_i$. Let $D = \max_i D_i$. We add a delay of $D - D_i$ to the path $e_i$ so that all paths now have equal delay. We activate the edges as follows:

1) Activate all edges in the edge-disjoint path $e_1$ simultaneously for a period $T$, where $T \gg D$. The source, on the first $T - D$ activations, will transmit $(T - D)$ coded information symbols, followed by a sequence of $D$ zero symbols. The reason for this will become clear shortly. The net effect will be to create a $(T - D) \times (T - D)$ transfer matrix $H_1$ from the $(T - D)$ source symbols to the last $(T - D)$ symbols received by the destination (the first $D$ symbols received by the destination are all zero).

The matrix $H_1$ will be either upper-triangular or lower-triangular, with the elements along the diagonal all equal to the path gain $g_1$ on path $e_1$, according to whether condition 1 or condition 2 of the theorem is satisfied. First, we explain the case when the graph has no directed cycles. In this case, off-diagonal terms above the diagonal can arise due to the presence of shortcuts. However, off-diagonal terms below the diagonal would require a directed cycle and will thus not appear. Therefore the matrix will be upper-triangular in this case. Next, for the case when the paths have no shortcuts, off-diagonal terms below the diagonal can arise due to the presence of cycles in the graph. However, no terms above the diagonal will be present because of the presumed absence of shortcuts. Thus the induced matrix will be lower-triangular in structure.

2) Repeat Step 1 for each of the remaining edge-disjoint paths $e_2, \ldots, e_M$. The net transfer matrix $H$ will be block diagonal of the form

$$ H = \begin{bmatrix} H_1 & & & \\ & H_2 & & \\ & & \ddots & \\ & & & H_M \end{bmatrix}, \qquad (39) $$

composed of $M$ blocks along the diagonal, one corresponding to each path, and either all of them are upper triangular or all are lower triangular by the argument above.
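The triangular structure described in the two steps above can be visualized with a small sketch. It assumes, purely for illustration, that the activated path behaves like a tapped delay line: the main path contributes a tap of gain $g$ at delay $D$, a shortcut contributes a tap at a smaller delay, and a cycle contributes a tap at a larger delay. The function `induced_matrix` and the numerical tap values are hypothetical; rows index the last $T - D$ received symbols and columns index the $T - D$ transmitted symbols.

```python
def induced_matrix(taps, D, T):
    """(T-D) x (T-D) map from the T-D source symbols to the last T-D outputs.

    taps maps a delay to its gain; the output at time D + i depends on the
    input at time j through the tap of delay D + i - j, if such a tap exists.
    """
    n = T - D
    return [[taps.get(D + i - j, 0) for j in range(n)] for i in range(n)]

def is_upper_triangular(H):
    return all(H[i][j] == 0 for i in range(len(H)) for j in range(i))

def is_lower_triangular(H):
    return all(H[i][j] == 0 for i in range(len(H)) for j in range(i + 1, len(H)))

D, T, g = 3, 7, 2
H_shortcut = induced_matrix({D: g, D - 1: 5}, D, T)  # extra tap at a shorter delay
H_cycle = induced_matrix({D: g, D + 2: 7}, D, T)     # extra tap at a longer delay
```

In both cases the diagonal carries the path gain $g$. The shortcut tap lands strictly above the diagonal and the cycle tap strictly below it, which is the structure the lower bound for triangular matrices exploits.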

Now, if $d(r)$ is the DMT of the network operating under the protocol given above, then

$$ d(r) = d_H(MTr), \qquad (40) $$

since $MT$ time instants are used up by the protocol in order to obtain the induced channel matrix $H$. We next proceed to lower bound $d_H(r)$. By Theorem 2.12, we have the lower bound

$$ d_H(r) \ge d_{H^{(d)}}(r), \qquad (41) $$

where $H^{(d)}$ is the matrix comprising the diagonal terms of $H$. Next, we observe that $H^{(d)}$ corresponds to a parallel channel with $M$ fading coefficients $g_1, g_2, \ldots, g_M$, each of them repeated $T - D$ times. We can compute the DMT of this parallel channel using the result on the DMT of parallel channels. Thus we get

$$ d_{H^{(d)}}(r) = M\left(1 - \frac{r}{M(T-D)}\right)^+, \qquad (42) $$

$$ d(r) = d_H(MTr) \ge d_{H^{(d)}}(MTr), \qquad (43) $$

$$ \Rightarrow\ d(r) \ge M\left(1 - \frac{MTr}{M(T-D)}\right)^+ = M\left(1 - \frac{Tr}{T-D}\right)^+. \qquad (44) $$

For $T$ tending to $\infty$, we get $d(r) \ge M(1-r)^+$.

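Appendix I
Proof of Lemma 2.2

Let the multinomial $f$ be written as a sum of $S$ monomials, $f(X_1, X_2, \ldots, X_M) := \sum_{i=1}^{S} c_i f_i(X_1, X_2, \ldots, X_M)$, where for every $i$, $c_i$ is a constant and $f_i$ is a monomial, i.e., it is comprised only of a product of powers of the $X_j$. Then for every assignment $X_j = h_j$,

$$ |f(h_1, h_2, \ldots, h_M)| \le \sum_{i=1}^{S} |c_i|\, |f_i(h_1, h_2, \ldots, h_M)|. $$

Now we have

$$ \Pr\{|f(h_1, \ldots, h_M)|^2 > k\} = \Pr\{|f(h_1, \ldots, h_M)| > k^{1/2}\} $$
$$ \le \Pr\Big\{\sum_{i} |c_i|\, |f_i(h_1, \ldots, h_M)| > k^{1/2}\Big\} \qquad (45) $$
$$ \le \Pr\Big\{\bigcup_{i} \Big\{ |c_i|\, |f_i(h_1, \ldots, h_M)| > \frac{k^{1/2}}{S} \Big\}\Big\} $$
$$ \le \sum_{i} \Pr\Big\{ |c_i|^2\, |f_i(h_1, \ldots, h_M)|^2 > \frac{k}{S^2} \Big\} $$
$$ \le \sum_{i} \Pr\Big\{ |f_i(h_1, \ldots, h_M)|^2 > \frac{k}{c_{\max} S^2} \Big\}, \qquad (46) $$

where $c_{\max}$ is the maximum over all $\{|c_i|^2\}$. Now $|f_i(h_1, h_2, \ldots, h_M)|^2$ is a monomial in the $|h_j|^2$ as well. Define $u_j := |h_j|^2$. Then $u_j$ is the squared norm of a $\mathcal{CN}(0,1)$ random variable $h_j$, and therefore has an exponential distribution. We will regard $|f_i(h_1, h_2, \ldots, h_M)|^2$ as a monomial $g_i(u_1, u_2, \ldots, u_M)$ in $\{u_j\}$. Thus

$$ g_i(u_1, u_2, \ldots, u_M) = \prod_{j=1}^{M} u_j^{a_{ij}}, $$

where $0 \le a_{ij} \le D$ is an integer, and $D$ is the maximum degree of any of the monomials $g_i$ in any of the variables.

Now gathering a single term in the summation on the RHS of (46), we have

$$ \Pr\Big\{ |f_i(h_1, \ldots, h_M)|^2 > \frac{k}{c_{\max} S^2} \Big\} = \Pr\Big\{ g_i(u_1, \ldots, u_M) > \frac{k}{c_{\max} S^2} \Big\} $$
$$ \le \sum_{j=1}^{M} \Pr\Big\{ u_j^{a_{ij}} > \Big(\frac{k}{c_{\max} S^2}\Big)^{\frac{1}{M}} \Big\} $$
$$ \le M \Pr\Big\{ u_j > \Big(\frac{k}{c_{\max} S^2}\Big)^{\frac{1}{MD}} \Big\} \qquad (47) $$
$$ = M \exp\Big( -\Big(\frac{k}{c_{\max} S^2}\Big)^{\frac{1}{MD}} \Big). \qquad (48) $$

Equation (47) follows if

$$ \frac{k}{c_{\max} S^2} > 1. $$

We get this condition by setting $\delta := c_{\max} S^2 > 0$, since by the hypothesis of the lemma, we have $k > \delta$. Combining (46) and (48), we get, for $k > \delta$,

$$ \Pr\{|f(h_1, h_2, \ldots, h_M)|^2 > k\} \le MS \exp\Big( -\Big(\frac{k}{c_{\max} S^2}\Big)^{\frac{1}{MD}} \Big) = A \exp\big(-B k^{\frac{1}{d}}\big), $$

where $A := MS$, $B := \big(\frac{1}{c_{\max} S^2}\big)^{\frac{1}{MD}}$, $\delta = c_{\max} S^2$ and $d := MD$.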

Appendix II
Proof of Theorem 2.3

We wish to show that

$$ \Pr\big(\log\det(I + \rho H H^\dagger \Sigma^{-1}) < r \log \rho\big) \doteq \Pr\big(\log\det(I + \rho H H^\dagger) < r \log \rho\big). $$

Let the correlation matrix of the noise vector be denoted by $\Sigma$. The noise covariance matrix $\Sigma$ depends on the channel realization and is therefore a random matrix, given by

$$ \Sigma = I + \sum_{i=1}^{M} G_i G_i^\dagger. \qquad (49) $$

Let $\lambda_i(A)$, $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ denote the $i$-th largest, maximum and minimum eigenvalues of a positive semi-definite matrix $A$. If the context is clear, we may avoid specifying the matrix, and just use $\lambda_i$, $\lambda_{\max}$ and $\lambda_{\min}$ respectively.

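To prove the theorem, we will use the Amir-Moez bound on the eigenvalues of the product of Hermitian, positive-definite matrices [52]. By this bound, for any two positive definite $n \times n$ Hermitian matrices $A$, $B$:

$$ \lambda_i(A)\lambda_{\min}(B) \le \lambda_i(AB) \le \lambda_i(A)\lambda_{\max}(B). $$

So we get

$$ \det(I + \rho AB) = \prod_i \big(1 + \rho \lambda_i(AB)\big) \le \prod_i \big(1 + \rho \lambda_i(A)\lambda_{\max}(B)\big) = \det\big(I + \rho \lambda_{\max}(B) A\big). $$

Similarly,

$$ \det(I + \rho AB) \ge \det\big(I + \rho \lambda_{\min}(B) A\big). $$

Therefore, for any two positive definite $n \times n$ Hermitian matrices $A$, $B$,

$$ \det\big(I + \rho \lambda_{\min}(B) A\big) \le \det(I + \rho AB) \le \det\big(I + \rho \lambda_{\max}(B) A\big). \qquad (50) $$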
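The determinant sandwich in (50) can be sanity-checked numerically. The sketch below restricts attention to real symmetric positive-definite $2 \times 2$ matrices so that eigenvalues and determinants have closed forms and no external library is needed; the helper names and the particular matrices `A`, `B` are illustrative assumptions, and a single numerical instance is of course not a proof.

```python
import math

def eig_sym2(S):
    """(lambda_min, lambda_max) of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = S
    mean, dev = (a + d) / 2, math.hypot((a - d) / 2, b)
    return mean - dev, mean + dev

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det_i_plus(rho, X):
    """det(I + rho * X) for a 2x2 matrix X."""
    (a, b), (c, d) = X
    return (1 + rho * a) * (1 + rho * d) - rho * rho * b * c

def sandwich(A, B, rho):
    """Return the lower bound, the exact value, and the upper bound of (50)."""
    lam_min, lam_max = eig_sym2(B)
    scaled = lambda s: [[s * x for x in row] for row in A]
    return (det_i_plus(rho, scaled(lam_min)),
            det_i_plus(rho, matmul2(A, B)),
            det_i_plus(rho, scaled(lam_max)))

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.5], [0.5, 2.0]]
lower, middle, upper = sandwich(A, B, rho=10.0)
```

For these matrices the exact value $\det(I + \rho AB) = 966$ lies between the two bounds obtained by replacing $B$ with its extreme eigenvalues, as (50) predicts.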

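Applying (50) with $A = HH^\dagger$ and $B = \Sigma^{-1}$, we get

$$ \det\big(I + \rho HH^\dagger \lambda_{\min}(\Sigma^{-1})\big) \le \det\big(I + \rho HH^\dagger \Sigma^{-1}\big) \qquad (51) $$
$$ \le \det\big(I + \rho HH^\dagger \lambda_{\max}(\Sigma^{-1})\big). \qquad (52) $$

Continuing from (51) and (52), we have

$$ \Pr\{\log\det(I + \rho HH^\dagger \lambda_{\min}(\Sigma^{-1})) < r \log \rho\} \ge \Pr\{\log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho\} \ge \Pr\{\log\det(I + \rho HH^\dagger \lambda_{\max}(\Sigma^{-1})) < r \log \rho\}. \qquad (53) $$

In the following, we will prove that both the bounds coincide as $\rho \to \infty$. We begin with the bounds on $\lambda_{\min}(\Sigma)$ and $\lambda_{\max}(\Sigma)$. In order to show that the lower and upper bounds on the expression converge to the value $\Pr\{\log\det(I + \rho HH^\dagger) < r \log \rho\}$, we first provide a lower bound for each $\lambda_i(\Sigma)$. Let $v_i$ be a unit-norm eigenvector corresponding to $\lambda_i(\Sigma)$ for every realization of $\Sigma$. Then, with $\Sigma = I + \sum_{j=1}^{M} G_j G_j^\dagger$,

$$ \lambda_i(\Sigma) = v_i^\dagger \Big(I + \sum_{j=1}^{M} G_j G_j^\dagger\Big) v_i = 1 + \sum_{j=1}^{M} \|G_j^\dagger v_i\|^2 \ge 1 $$
$$ \Rightarrow\ \lambda_i(\Sigma) \ge 1\ \ \forall i \ \Rightarrow\ \lambda_{\min}(\Sigma) \ge 1. \qquad (54) $$

Hence, since $\lambda_{\max}(\Sigma^{-1}) = 1/\lambda_{\min}(\Sigma) \le 1$,

$$ \Pr\{\log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho\} \ge \Pr\{\log\det(I + \rho HH^\dagger \lambda_{\max}(\Sigma^{-1})) < r \log \rho\} \qquad (55) $$
$$ \ge \Pr\{\log\det(I + \rho HH^\dagger) < r \log \rho\}. \qquad (56) $$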



Now we proceed to get an upper bound on $\lambda_{\max}(\Sigma)$:

$$ \lambda_{\max}(\Sigma) = \lambda_{\max}\Big(I + \sum_{i=1}^{M} G_i G_i^\dagger\Big) = 1 + \lambda_{\max}\Big(\sum_{i=1}^{M} G_i G_i^\dagger\Big), $$

i.e.,

$$ \lambda_{\max}(\Sigma) \le 1 + \mathrm{Tr}\Big(\sum_{i=1}^{M} G_i G_i^\dagger\Big) = 1 + \sum_{i=1}^{M} \mathrm{Tr}(G_i G_i^\dagger) = 1 + \sum_{i=1}^{M} \|G_i\|_F^2 = 1 + \sum_{i=1}^{M} \sum_{j} |g_{ij}|^2, \qquad (57) $$

where $g_{ij}$ represents a polynomial entry of the matrix $G_i$. Let $x_1, x_2, \ldots, x_{2L}$ denote, in some order, the real and imaginary parts of $h_1, h_2, \ldots, h_L$. Then the right hand side in (57) is a polynomial in the variables $x_i$, $i = 1, 2, \ldots, 2L$.

This leads to the following inequality:

$$ \lambda_{\max}(\Sigma) \le g(x_1, x_2, \ldots, x_{2L}) + 1, \qquad (58) $$

where $g(x_1, x_2, \ldots, x_{2L})$ is a polynomial without constant term in the variables $\{x_i\}$. Let us invoke Lemma 2.2 for the polynomial $g$, which does not possess any constant term. The lemma is valid for all $k > \delta$, where $\delta$ depends on $g$. Let us choose $\rho_0$ such that $\rho_0^{\epsilon} - 1 > \delta$, so that $\forall \rho > \rho_0$ we have $\rho^{\epsilon} - 1 > \delta$. Now, $\forall \rho > \rho_0$, we can invoke Lemma 2.2 and get

$$ \Pr\{\lambda_{\max}(\Sigma) > \rho^{\epsilon}\} \le \Pr\{g(x_1, x_2, \ldots, x_{2L}) > \rho^{\epsilon} - 1\} \le A \exp\big(-B(\rho^{\epsilon} - 1)^{\frac{1}{d}}\big), \qquad (59) $$

for some constants $A, B, d > 0$.

Let $\mathcal{H}$ denote the set of all the fading coefficients in the network, and let $h \in \mathcal{H}$ denote a realization of the fading coefficients. Thus $h$ will be a vector specifying $h_i$, $i = 1, 2, \ldots, L$. Clearly, once $h$ is given, the values of the matrices $H$ and $G_i$ are all well defined, since all of them depend only on the $h_i$.



Let $\mathcal{A} = \{h \in \mathcal{H} \mid \log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho\}$ and $\mathcal{B} = \{h \in \mathcal{H} \mid \lambda_{\max}(\Sigma) > \rho^{\epsilon}\}$ be two events. Then,

$$ \Pr(\mathcal{A}) = \Pr(\mathcal{A} \cap \mathcal{B}^c) + \Pr(\mathcal{A} \cap \mathcal{B}) \le \Pr(\mathcal{A} \cap \mathcal{B}^c) + \Pr(\mathcal{B}) \le \Pr(\mathcal{A} \cap \mathcal{B}^c) + A \exp\big(-B(\rho^{\epsilon} - 1)^{\frac{1}{d}}\big). \qquad (60) $$

Now,

$$ \mathcal{A} \cap \mathcal{B}^c = \{h \in \mathcal{H} \mid \log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho,\ \lambda_{\max}(\Sigma) \le \rho^{\epsilon}\} $$
$$ \subset \{h \in \mathcal{H} \mid \log\det(I + \rho HH^\dagger \lambda_{\min}(\Sigma^{-1})) < r \log \rho,\ \lambda_{\max}(\Sigma) \le \rho^{\epsilon}\} $$
$$ = \{h \in \mathcal{H} \mid \log\det(I + \rho HH^\dagger (\lambda_{\max}(\Sigma))^{-1}) < r \log \rho,\ \lambda_{\max}(\Sigma) \le \rho^{\epsilon}\} $$
$$ \subset \{h \in \mathcal{H} \mid \log\det(I + \rho^{1-\epsilon} HH^\dagger) < r \log \rho\}. \qquad (61) $$

Substituting (61) in (60), taking logarithms and dividing by $\log \rho$ on both sides, we have

$$ \frac{\log \Pr(\mathcal{A})}{\log \rho} \le \frac{\log[\Pr(\mathcal{A} \cap \mathcal{B}^c) + \Pr(\mathcal{B})]}{\log \rho} \le \frac{\log\big[\Pr\{h \in \mathcal{H} \mid \log\det(I + \rho^{1-\epsilon} HH^\dagger) < r \log \rho\} + A \exp\big(-B(\rho^{\epsilon} - 1)^{\frac{1}{d}}\big)\big]}{\log \rho}, \qquad (62) $$

$$ \lim_{\rho \to \infty} \frac{\log \Pr(\mathcal{A})}{\log \rho} \le \lim_{\rho \to \infty} \frac{\log\big[\Pr\{h \in \mathcal{H} \mid \log\det(I + \rho^{1-\epsilon} HH^\dagger) < r \log \rho\}\big]}{\log \rho}. $$

The last equation follows from (62), since the first term in the RHS of (62) varies inversely with an exponent of $\rho$ whereas the second term decays exponentially in $\rho$, and therefore the sum is dominated by the first term. After making the variable change $\rho' = \rho^{1-\epsilon}$ and replacing $\rho'$ by $\rho$, we get

$$ \lim_{\rho \to \infty} \frac{\log \Pr\{\log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho\}}{\log \rho} \le \lim_{\rho \to \infty} (1 - \epsilon) \frac{\log \Pr\{\log\det(I + \rho HH^\dagger) < \frac{r}{1-\epsilon} \log \rho\}}{\log \rho}. \qquad (63) $$

In (63), $\epsilon$ is arbitrary, and we let it tend to zero. Hence, by (63) and (56), the exponents for both the bounds in (53) coincide and we obtain

$$ \Pr\{\log\det(I + \rho HH^\dagger \Sigma^{-1}) < r \log \rho\} \doteq \Pr\{\log\det(I + \rho HH^\dagger) < r \log \rho\}. $$

This proves the assertion of the theorem.

Appendix III
Proof of Lemma 2.9

Consider any two intervals $R_1, R_2 \in \mathcal{R}$. If we are not able to find two such intervals, then clearly $L \le 1$, and we are done. Let $R_1 = [a, b]$ and $R_2 = [c, d]$, and without loss of generality assume that $a < b < c < d$, since they are, by hypothesis, disjoint. First, we claim that there exists a point $b < x_0 < c$ such that either $p'(x_0) = 0$ or $p''(x_0) = 0$. We now proceed to prove this claim.

Clearly, one of the two conditions (10) or (11) is violated just to the right of the point $x = b$, else the interval would extend beyond $b$. We consider two cases.

Case 1: $|p(b)| \ge k$. Condition (10) is violated in the region immediately to the right of the interval $[a, b]$. This implies that the absolute value of the polynomial, $|p(x)|$, has to be greater than $k$ in the beginning of the interval $[b, c]$. Also, we know that within $[a, b]$ and $[c, d]$, $|p(x)|$ is strictly less than $k$. This can happen in two ways, as shown in Fig. 7 (the other possibility is that the polynomial can be the negative of that shown in the figure, in which case the same argument holds). In either of these ways, the function has to go through the value $+k$ twice in $[b, c]$. Therefore, by Rolle's theorem, $p'(x_0) = 0$ for some $b < x_0 < c$.

Fig. 7. Condition (10) is violated in $[b, c]$: (a) Case 1(a), (b) Case 1(b).

Case 2: Condition (11) is violated at the end of the interval $[a, b]$. This implies that the absolute value of the derivative of $p(x)$, i.e., $|p'(x)|$, diminishes below $m$ in the beginning of the interval $[b, c]$. Also, we know that within $[a, b]$ and $[c, d]$, $|p'(x)|$ is greater than or equal to $m$. This can happen only in two ways, as shown in Fig. 8. In the first case, $p'(x_0) = 0$ for some $b < x_0 < c$. In the second case, the function $p'(x)$ takes the same value $+m$ twice in $[b, c]$, and hence by Rolle's theorem, $p''(x_0) = 0$ for some $b < x_0 < c$.

Fig. 8. Condition (11) is violated in $[b, c]$: (a) Case 2(a), (b) Case 2(b).

By the above claim, for any two arbitrary intervals in $\mathcal{R}$, there exists a real root of $p'(x)$ or $p''(x)$ between those two intervals. Since the number of roots of a polynomial is bounded by its degree, there will be only finitely many such intervals. In particular, the number of intervals $L$ is bounded by $2d$, which is an upper bound on the total number of zeros of $p'(x)$ and $p''(x)$.



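Appendix IV
Proof of Lemma 2.10

For any polynomial $f$ (not identically zero) in several Gaussian random variables, we have that

$$ \Pr\{f(x_1, x_2, \ldots, x_N) \ne 0\} = 1. $$

This follows since, letting

$$ \mathbf{y} = (x_1, x_2, \ldots, x_{N-1}), \qquad S = \{x_N \mid f(x_1, x_2, \ldots, x_N) = 0\}, $$

we see that

$$ \Pr\{f(x_1, x_2, \ldots, x_N) = 0\} = \int p(\mathbf{y}) \int_{S} p(x_N \mid \mathbf{y})\, dx_N\, d\mathbf{y} = 0, $$

where $p(\cdot)$ denotes the relevant probability density, because the innermost integral equals zero as $S$ is finite given a particular assignment of $\mathbf{y}$, i.e.,

$$ \int_{S} p(x_N \mid \mathbf{y})\, dx_N = 0. \qquad (64) $$

Let $\mathbf{x} := \{x_1, x_2, \ldots, x_N\}$. Let us define an indicator function $I_\delta(\mathbf{x})$ as follows:

$$ I_\delta(\mathbf{x}) = \begin{cases} 1, & |f(x_1, x_2, \ldots, x_N)| < \delta \\ 0, & \text{else.} \end{cases} \qquad (65) $$

Then

$$ \Pr\{|f(x_1, x_2, \ldots, x_N)| < \delta\} = \mathbb{E}_{\mathbf{x}}\, I_\delta(\mathbf{x}) = \mathbb{E}_{x_1, x_2, \ldots, x_{N-1}}\, \mathbb{E}_{x_N}\{ I_\delta(\mathbf{x}) \mid x_1, \ldots, x_{N-1} \} = \mathbb{E}_{x_1, x_2, \ldots, x_{N-1}} \Pr\{|f(x_1, x_2, \ldots, x_{N-1}, x_N)| < \delta\}. \qquad (66) $$

Let $f(x_N) := f(x_1, x_2, \ldots, x_{N-1}, x_N)$, where the dependence of $f$ on the first $N - 1$ variables is made implicit. Let

$$ f(x_N) = \sum_{k=0}^{d_N} b_k x_N^k, $$

where $d_N$ is the degree of the polynomial $f$ in the variable $x_N$. Since $b_{d_N}$ is a polynomial in the variables $x_1, \ldots, x_{N-1}$, it follows from the statement above that with probability one, $b_{d_N} \ne 0$.

Let

$$ g(x_N) = \frac{d f(x_N)}{d x_N} $$

be the partial derivative of $f(x_N)$ with respect to $x_N$. Then

$$ \Pr\{|f(x_1, x_2, \ldots, x_{N-1}, x_N)| < \delta\} = \Pr\{|f(x_N)| < \delta,\ |g(x_N)| \ge \delta^{1/2}\} + \Pr\{|f(x_N)| < \delta,\ |g(x_N)| < \delta^{1/2}\} \qquad (67) $$
$$ \le \Pr\{|f(x_N)| < \delta,\ |g(x_N)| \ge \delta^{1/2}\} + \Pr\{|g(x_N)| < \delta^{1/2}\}. \qquad (68) $$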




Fig. 9. f(x) in a region Ri 



Let us consider the first term on the RHS. The region 

TZ :— {|/(xjv)| < <>,\g{xN)\ > S^^^} is described by two 
conditions |/(a;jv)| < S and |5(a;Ar)| > S-^^^. It is shown 
in Lemma 12.91 that the set of all values of xjv satisfying 
both conditions can be expressed as the union of L pairwise- 
disjoint intervals Ri,i = 1,2,...,L with L < 2dN. Now 
Pr(a;Ar e R) = Pr(xiv £ Rz)- We will now proceed 

to upper-bound the probability x^r £ i?^. To do so, consider 
Fig.|9] Let Axn be the width of the interval Ri and A/(.t„) be 
the height (equal to the difference in maximum and minimum 
values of /(.) in Ri). Since the slope of the curve g{x) is 
greater than S^^^ throughout Ri, we have that 



Aa 



For our purposes, we can assume without loss of generality 
that 



which gives us 



A/(a:„) 



Aa::„ < 



> 6'^\ 

(51/2 



However in any contiguous region, A/(a;„) < 26. This 
implies that 

26 



Since x„ is a 7V(0, 1) random variable, we have that 
Pr{Ax„ < a} < ca, where c is the maximum value of the 
gaussian pdf. 
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Therefore, 



{x e i?,} C {x„ < 25^/2} 
Pr{x e Ri} < Pr{x„ < 2S^^^} 



Using 



Pr{xGi?} = ^Pr{xei?,}, 

i=l 

we obtain 

Pr{|/(xAr)| <<5,|,9(x^)| ><5i/2} < L2c5'/^ ^CS'^'. 

(69) 

Plugging ( |69] l into (l68T l yields 

Pr{|/(xAr)| <5} < C<5l/2+Pr{|5(xAr)| <5l/2|^ 

(70) 

Since (/(x) is of lower degree than f{x), the process can 
be continued to yield 

Pr {|/(xAr)| < <5} < Cd'/^ + C5^/^ + ... + C6^/^"''~' 
+ Pr{rf^!6rf„<5i/2'"}. (71) 

Only the last term involving is a function of the 
remaining variables xi,X2, ■■■,xn~i- We next substitute ( TTTT i 
into ( |66l ) and take the expectation over the remaining variables. 
This yields 

Pr{|/(xi,X2,...,XAr)| <<5} (72) 
< C5i/2 + C(5i/* + ... + C<5i/2'"" 

+IExi,X2,...,X«_i brf„<51/2''«}- (73) 

The last term is identical to that on the right-hand side of
(66), except that the polynomial $d_N!\, b_{d_N}$ involves $(N - 1)$
or fewer variables, and hence this procedure can be continued.
Eventually, we will be left with the probability that a constant
coefficient $J$ is smaller than $\delta^{1/2^s}$ for some integer $s$. Choosing
the constant $K$ appearing in the statement of the lemma
to equal $J^{2^s}$, we obtain that this probability is equal to the
probability that $K < \delta$. But by hypothesis, $K > \delta$ and hence
this probability is equal to zero. This allows us to rewrite the
bound on probability appearing in (73) as

\[ \Pr\{|f(x_1, x_2, \ldots, x_N)| < \delta\} \le C_1\left(\delta^{1/2} + \delta^{1/4} + \cdots + \delta^{1/2^e}\right) \]

for a suitable constant $C_1$ and some integer $e$.

Choosing $K < 1$ forces $\delta < 1$ since, by hypothesis, $\delta < K$.
In this case

\[ \delta^{1/2} + \delta^{1/4} + \cdots + \delta^{1/2^e} \le e\,\delta^{1/2^e}. \]

With $A := eC_1$ and $d := 2^e$, we get

\[ \Pr\{|f(x_1, x_2, \ldots, x_N)| < \delta\} \le A\delta^{1/d} \tag{74} \]

as desired.
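
As a sanity check on the form of the bound (74), the following minimal Python sketch evaluates the simplest possible instance, $f(x) = x^2$ with a single $\mathcal{N}(0,1)$ variable. This polynomial, the function names and the chosen values of $\delta$ are illustrative assumptions and do not appear in the proof above. For this $f$ the probability can be written exactly with the error function, and it is bounded by $2c\,\delta^{1/2}$, i.e. by a fractional power of $\delta$; the ratio to $\delta$ itself grows without bound as $\delta \to 0$. The helper `root_sum` evaluates the sum used in the last step of the proof.

```python
import math

# c: the maximum value of the N(0, 1) pdf, as in the proof above.
PDF_PEAK = 1.0 / math.sqrt(2.0 * math.pi)


def prob_square_below(delta):
    """Exact Pr{|x^2| < delta} = Pr{|x| < sqrt(delta)} for x ~ N(0, 1)."""
    return math.erf(math.sqrt(delta) / math.sqrt(2.0))


def root_sum(delta, e):
    """delta^(1/2) + delta^(1/4) + ... + delta^(1/2^e)."""
    return sum(delta ** (1.0 / 2 ** k) for k in range(1, e + 1))


if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-4, 1e-6):
        p = prob_square_below(delta)
        # Illustrative bound A * delta^(1/d) with A = 2c and d = 2.
        bound = 2.0 * PDF_PEAK * math.sqrt(delta)
        print(f"delta={delta:.0e}  Pr={p:.3e}  bound={bound:.3e}  "
              f"Pr/delta={p / delta:.1f}")
    # For delta < 1 every term of the sum is at most the last one.
    print(root_sum(0.01, 3), "<=", 3 * 0.01 ** (1.0 / 8))
```

Since $\delta^{1/2} \le \delta^{1/4}$ for $\delta < 1$, the exponent $1/2$ found here is compatible with a bound of the form $A\delta^{1/d}$ for any larger $d$.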



Appendix V 
Proof of Lemma 2.5



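Let $\mathbf{x}$, $\mathbf{y}$, $\mathbf{H}$ denote the concatenated input and
output vectors and the channel matrix, respectively, i.e.,
$\mathbf{x} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M]^T$, $\mathbf{y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M]^T$ and

\[
\mathbf{H} =
\begin{bmatrix}
\mathbf{H}_1 & & & \\
& \mathbf{H}_2 & & \\
& & \ddots & \\
& & & \mathbf{H}_M
\end{bmatrix}.
\tag{75}
\]
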



Then the input-output relation of the parallel channel is 
given by y = Hx + w. We now proceed to determine the 
probability of outage. We have: 



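\[
\begin{aligned}
I(\mathbf{x}; \mathbf{y} \mid \mathbf{H} = H)
&= h(\mathbf{y} \mid \mathbf{H} = H) - \sum_{i=1}^{M} h(\mathbf{y}_i \mid \mathbf{y}_1^{i-1}, \mathbf{x}, \mathbf{H} = H) \\
&= h(\mathbf{y} \mid \mathbf{H} = H) - \sum_{i=1}^{M} h(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{H} = H) \\
&\le \sum_{i=1}^{M} h(\mathbf{y}_i \mid \mathbf{H} = H) - \sum_{i=1}^{M} h(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{H} = H) \\
&= \sum_{i=1}^{M} \left[ h(\mathbf{y}_i \mid \mathbf{H} = H) - h(\mathbf{y}_i \mid \mathbf{x}_i, \mathbf{H} = H) \right] \\
&= \sum_{i=1}^{M} I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H} = H) \\
&= \sum_{i=1}^{M} I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i).
\end{aligned}
\tag{76}
\]

Hence,

\[ \Pr\{I(\mathbf{x}; \mathbf{y} \mid \mathbf{H} = H) < r \log \rho\} \ge \Pr\left\{\sum_{i=1}^{M} I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i) < r \log \rho\right\}. \]
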



Equality in the equation above holds if all the $\mathbf{x}_i$ are
independent. We will therefore choose the $\mathbf{x}_i$ to be independent
for the rest of the discussion, since this maximizes the mutual
information and hence minimizes the outage probability.
Define $Z_i := I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i)$. Thus $Z_i$ is a function of
the channel realization $H_i$ and is therefore a random variable.
Since the $\{\mathbf{H}_i\}$ are independent by the hypothesis of the lemma
and the $\mathbf{x}_i$ are independent by the argument above, the $\{Z_i\}$ are also
independent.

Let $R = r \log(\rho)$ and $R_i = r_i \log(\rho)$ for $i = 1, 2, \ldots, M$.
Our next goal is to evaluate $\Pr\{\sum_{i=1}^{M} Z_i < r \log(\rho)\}$. To do
this, we first consider the case $M = 2$ and evaluate
$\Pr\{Z_1 + Z_2 < r \log(\rho)\}$. We then extend this to general $M$
by induction. We define



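\[ F_{Z_i}(R_i) := \Pr\{Z_i < R_i\}. \]

Let $F_{Z_i}(R_i) \doteq \rho^{-d_i(r_i)}$. Then

\[ f_{Z_i}(R_i) = \frac{1}{\log(\rho)}\, \frac{d}{dr_i} F_{Z_i}(R_i) \doteq -\frac{d\, d_i(r_i)}{dr_i}\, \rho^{-d_i(r_i)} \doteq \rho^{-d_i(r_i)}. \]

The last equation follows since $d_i(r_i)$ is a decreasing function,
making $-\frac{d\, d_i(r_i)}{dr_i}$ positive. For $M = 2$ we have

\[ \rho^{-d(r)} \doteq \Pr\{Z_1 + Z_2 < R\} = \int_0^R f_{Z_1}(R_1)\, F_{Z_2}(R - R_1)\, dR_1. \]

By Varadhan's Lemma [47], the SNR exponent of the
integral is given by

\[ d(r) = \inf_{r_1 \ge 0} \left[ d_1(r_1) + d_2(r - r_1) \right] = \inf_{(r_1, r_2):\ r_1 + r_2 = r}\ \sum_{i=1}^{2} d_i(r_i). \]

Proceeding by induction, we get for the general case with $M$
parallel channels, where

\[ \rho^{-d(r)} \doteq \Pr\left\{\sum_{i=1}^{M} Z_i < r \log(\rho)\right\}, \]

that

\[ d(r) = \inf_{(r_1, r_2, \ldots, r_M):\ \sum_{i=1}^{M} r_i = r}\ \sum_{i=1}^{M} d_i(r_i). \]



Appendix VI
Proof of Lemma [278]



Following the same line of reasoning as in the proof of
Lemma 2.5, we choose the $\mathbf{x}_i$ to be independent. For computing
the DMT, we know from Lemma 2.4 that the inputs can in fact
be independent and identically distributed with a $\mathcal{CN}(0, I)$
distribution. So we have

\[ I(\mathbf{x}; \mathbf{y} \mid \mathbf{H} = H) = \sum_{i=1}^{M} I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i), \]

\[
\begin{aligned}
\Pr\{I(\mathbf{x}; \mathbf{y} \mid \mathbf{H} = H) < r \log \rho\}
&= \Pr\left\{\sum_{i=1}^{M} I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i) < r \log \rho\right\} \\
&= \Pr\left\{\sum_{i=1}^{N} n_i\, I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i) < r \log \rho\right\}.
\end{aligned}
\]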



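Now, define $Z_i := n_i\, I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i)$. Also let

\[
\begin{aligned}
\rho^{-d'_i(r)} &\doteq \Pr\{Z_i < r \log(\rho)\} \\
&= \Pr\left\{I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i) < \frac{r}{n_i} \log(\rho)\right\} \doteq \rho^{-d_i(r/n_i)},
\end{aligned}
\]

where $\rho^{-d_i(r)} \doteq \Pr\{I(\mathbf{x}_i; \mathbf{y}_i \mid \mathbf{H}_i = H_i) < r \log(\rho)\}$.

Using the same convolution argument as in the proof of
Lemma 2.5,

\[
\begin{aligned}
d(r) &= \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} r_i = r}\ \sum_{i=1}^{N} d'_i(r_i) \\
&= \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} r_i = r}\ \sum_{i=1}^{N} d_i\!\left(\frac{r_i}{n_i}\right) \\
&= \inf_{(r_1, r_2, \ldots, r_N):\ \sum_{i=1}^{N} n_i r_i = r}\ \sum_{i=1}^{N} d_i(r_i).
\end{aligned}
\]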
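
The constrained infimum above can be evaluated numerically for small cases. The sketch below is a minimal illustration under stated assumptions: it handles two links only, it takes each link to be a single-antenna Rayleigh-fading link with the standard DMT $d_i(r) = (1 - r)^+$, and the function names, the multiplicities `n1`, `n2` and the grid resolution are hypothetical choices made for the example rather than anything specified in the proof. With `n1 = n2 = 1` it evaluates the parallel-channel formula of Appendix V; with other multiplicities it evaluates the weighted constraint $\sum_i n_i r_i = r$ derived here.

```python
def scalar_dmt(r):
    """Assumed per-link DMT of a single-antenna Rayleigh link: (1 - r)^+."""
    return max(0.0, 1.0 - r)


def two_link_dmt(d1, d2, r, n1=1, n2=1, steps=1000):
    """Grid search for inf of d1(r1) + d2(r2) over r1, r2 >= 0
    subject to n1 * r1 + n2 * r2 = r."""
    best = float("inf")
    for k in range(steps + 1):
        r1 = (r / n1) * k / steps      # r1 sweeps the interval [0, r / n1]
        r2 = (r - n1 * r1) / n2        # r2 is fixed by the constraint
        best = min(best, d1(r1) + d2(r2))
    return best


if __name__ == "__main__":
    # Two independent links, each used once: the search returns 2 - r.
    for r in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(r, two_link_dmt(scalar_dmt, scalar_dmt, r))
    # Second link counted twice (n2 = 2): rate is taken from the first
    # link until it is exhausted, then from the repeated link.
    for r in (0.5, 1.0, 2.0, 3.0):
        print(r, two_link_dmt(scalar_dmt, scalar_dmt, r, n1=1, n2=2))
```

A grid search is used only because it is easy to verify by hand; for piecewise-linear $d_i$ the infimum could equally be found by inspecting the breakpoints.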